SURFACE EXPRESSION LIBRARIES
OF RANDOMIZED PEPTIDES 

This application is a continuation-in-part of U.S.
Serial No. 07/590,664, filed on September 28, 1990.

BACKGROUND OF THE INVENTION

This invention relates generally to methods for 
synthesizing and expressing oligonucleotides and, more 
particularly, to methods for expressing oligonucleotides 
having random codon sequences.

Oligonucleotide synthesis proceeds via linear
coupling of individual monomers in a stepwise reaction.
The reactions are generally performed on a solid phase
support by first coupling the 3' end of the first monomer
to the support. The second monomer is added to the 5'
end of the first monomer in a condensation reaction to
yield a dinucleotide coupled to the solid support. At
the end of each coupling reaction, the by-products and
unreacted, free monomers are washed away so that the
starting material for the next round of synthesis is the
pure oligonucleotide attached to the support. In this
reaction scheme, the stepwise addition of individual
monomers to a single, growing end of an oligonucleotide
ensures accurate synthesis of the desired sequence.
Moreover, unwanted side reactions, such as the
condensation of two oligonucleotides, are eliminated,
resulting in high product yields.

In some instances, it is desired that synthetic
oligonucleotides have random nucleotide sequences. This
result can be accomplished by adding equal proportions of
all four nucleotides in the monomer coupling reactions,
leading to the random incorporation of all nucleotides
and yielding a population of oligonucleotides with random
sequences. Since all possible combinations of nucleotide
sequences are represented within the population, all
possible codon triplets will also be represented. If the
objective is ultimately to generate random peptide
products, this approach has a severe limitation because
the random codons synthesized will bias the amino acids
incorporated during translation of the DNA by the cell
into polypeptides.

The bias is due to the redundancy of the genetic
code. There are four nucleotide monomers, which leads to
sixty-four possible triplet codons. With only twenty
amino acids to specify, many of the amino acids are
encoded by multiple codons. Therefore, a population of
oligonucleotides synthesized by sequential addition of
monomers from a random population will not encode
peptides whose amino acid sequence represents all
possible combinations of the twenty different amino acids
in equal proportions. That is, the frequency of amino
acids incorporated into polypeptides will be biased
toward those amino acids which are specified by multiple
codons.
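
By way of illustration only, the redundancy described above can be tabulated with a short Python sketch. The sketch assumes the standard genetic code, written as the usual 64-letter string with bases ordered T, C, A, G, and assumes equimolar monomers so that each of the sixty-four triplets is equally likely. The names CODON_TABLE and codons_per_residue are illustrative and not part of the disclosure.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_per_residue():
    """Count how many of the 64 triplets specify each amino acid (or stop)."""
    return Counter(CODON_TABLE.values())


counts = codons_per_residue()
for residue, n in counts.most_common():
    # With equimolar monomers every triplet has probability 1/64,
    # so a residue's expected frequency is n/64.
    print(f"{residue}: {n} codons, expected frequency {n / 64:.3f}")
```

Under these assumptions leucine, serine and arginine (six codons each) would be expected six times as often as tryptophan or methionine (one codon each).
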
To alleviate amino acid bias due to the redundancy
of the genetic code, the oligonucleotides can be
synthesized from nucleotide triplets. Here, a triplet
coding for each of the twenty amino acids is synthesized
from individual monomers. Once synthesized, the triplets
are used in the coupling reactions instead of individual
monomers. By mixing equal proportions of the triplets,
synthesis of oligonucleotides with random codons can be
accomplished. However, the cost of synthesis from such
triplets far exceeds that of synthesis from individual
monomers because triplets are not commercially available.



Amino acid bias can be reduced, however, by
synthesizing the degenerate codon sequence NNK, where N is
a mixture of all four nucleotides and K is a mixture of
guanine and thymine nucleotides. Each position within an
oligonucleotide having this codon sequence will contain a
total of 32 codons (12 amino acids being represented
once, 5 represented twice, 3 represented three times and
one codon being a stop codon). Oligonucleotides expressed
with such degenerate codon sequences will produce peptide
products whose sequences are biased toward those amino
acids being represented more than once. Thus, populations
of peptides whose sequences are completely random cannot
be obtained from oligonucleotides synthesized from
degenerate sequences.
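
The NNK codon counts given above can be checked by enumeration. The following Python sketch is offered as an illustration only; it assumes the standard genetic code, and the helper name nnk_codons is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_codons():
    """Enumerate NNK: N is any of A, C, G, T and K is G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


per_residue = Counter(CODON_TABLE[c] for c in nnk_codons())
stops = per_residue.pop("*", 0)               # set the stop codon aside
multiplicity = Counter(per_residue.values())  # codons per residue -> number of residues

print(len(nnk_codons()), "codons in total,", stops, "stop codon")
for n_codons, n_residues in sorted(multiplicity.items()):
    print(f"{n_residues} amino acids are represented {n_codons} time(s)")
```
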

There thus exists a need for a method to express 
oligonucleotides having a fully random or desirably 
biased sequence which alleviates genetic redundancy. The 
present invention satisfies this need and provides
additional advantages as well. 

SUMMARY OF THE INVENTION 

The invention provides a plurality of procaryotic 
cells containing a diverse population of expressible 
oligonucleotides operationally linked to expression 
elements, the expressible oligonucleotides having a 
desirable bias of random codon sequences. 

BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 1 is a schematic drawing for synthesizing
oligonucleotides from nucleotide monomers with random
tuplets at each position using twenty reaction vessels.

Figure 2 is a schematic drawing for synthesizing
oligonucleotides from nucleotide monomers with random
tuplets at each position using ten reaction vessels.

Figure 3 is a schematic diagram of the two vectors
used for sublibrary and library production from precursor
oligonucleotide portions. M13IX22 (Figure 3A) is the
vector used to clone the anti-sense precursor portions
(hatched box). The single-headed arrow represents the
Lac p/o expression sequences and the double-headed arrow
represents the portion of M13IX22 which is to be combined
with M13IX42. The amber stop codon for biological
selection and relevant restriction sites are also shown.
M13IX42 (Figure 3B) is the vector used to clone the sense
precursor portions (open box). Thick lines represent the
pseudo-wild type (ΨgVIII) and wild type (gVIII) gene
VIII sequences. The double-headed arrow represents the
portion of M13IX42 which is to be combined with M13IX22.
The two amber stop codons and relevant restriction sites
are also shown. Figure 3C shows the joining of vector
populations from sublibraries to form the functional
surface expression vector M13YEX. Figure 3D shows the
generation of a surface expression library in a
non-suppressor strain and the production of phage. The
phage are used to infect a suppressor strain (Figure 3E)
for surface expression and screening of the library.

Figure 4 is a schematic diagram of the vector used 
for generation of surface expression libraries from 
random oligonucleotide populations (M13IX30). The
symbols are as described for Figure 3. 


Figure 5 is the nucleotide sequence of M13IX42 (SEQ
ID NO: 1).

Figure 6 is the nucleotide sequence of M13IX22 (SEQ
ID NO: 2).

Figure 7 is the nucleotide sequence of M13IX30 (SEQ
ID NO: 3).

Figure 8 is the nucleotide sequence of M13ED03 (SEQ
ID NO: 4).

Figure 9 is the nucleotide sequence of M13IX421 (SEQ
ID NO: 5).

Figure 10 is the nucleotide sequence of M13ED04 (SEQ
ID NO: 6).

DETAILED DESCRIPTION OF THE INVENTION 

This invention is directed to a simple and
inexpensive method for synthesizing and expressing
oligonucleotides having a desirable bias of random codons
using individual monomers. The method is advantageous in
that individual monomers are used instead of triplets and,
by synthesizing only a non-degenerate subset of all
triplets, codon redundancy is alleviated. Thus, the
oligonucleotides synthesized represent a large proportion
of the possible random triplet sequences which can be
obtained. The oligonucleotides can be expressed, for
example, on the surface of filamentous bacteriophage in a
form which does not alter phage viability or impose
biological selections against certain peptide sequences.
The oligonucleotides produced are therefore useful for
generating an unlimited number of pharmacological and
research products.

In one embodiment, the invention entails the
sequential coupling of monomers to produce
oligonucleotides with a desirable bias of random codons.
The coupling reactions for the randomization of twenty
codons which specify the amino acids of the genetic code
are performed in ten different reaction vessels. Each
reaction vessel contains a support on which the monomers
for two different codons are coupled in three sequential
reactions. One of the reactions couples an equal mixture
of two monomers such that the final product has two
different codon sequences. The codons are randomized by
removing the supports from the reaction vessels and
mixing them to produce a single batch of supports
containing all twenty codons at a particular position.
Synthesis at the next codon position proceeds by equally
dividing the mixed batch of supports into ten reaction
vessels as before and sequentially coupling the monomers
for each pair of codons. The supports are again mixed to
randomize the codons at the position just synthesized.
The cycle of coupling, mixing and dividing continues
until the desired number of codon positions have been
randomized. After the last position has been randomized,
the oligonucleotides with random codons are cleaved from
the support. The random oligonucleotides can then be
expressed, for example, on the surface of filamentous
bacteriophage as gene VIII-peptide fusion proteins.
Alternative genes can be used as well.

In its broadest form, the invention provides a
diverse population of synthetic oligonucleotides
contained in vectors so as to be expressible in cells.
Such populations of diverse oligonucleotides can be fully
random at one or more codon sites or can be fully defined
at one or more sites, so long as at least one site has
randomly variable codons. The populations of
oligonucleotides can be expressed as fusion products in
combination with surface proteins of filamentous
bacteriophage, such as M13, as with gene VIII. The
vectors can be transfected into a plurality of cells,
such as the procaryote E. coli.

The diverse population of oligonucleotides can be 
formed by randomly combining first and second precursor 
populations, each precursor population having a desirable 
bias of random codon sequences. Methods of synthesizing 
and expressing the diverse population of expressible 
oligonucleotides are also provided. 

In a preferred embodiment, two populations of random 
oligonucleotides are synthesized. The oligonucleotides 
within each population encode a portion of the final 
oligonucleotide which is to be expressed. 

Oligonucleotides within one population encode the carboxy 
terminal portion of the expressed oligonucleotides. 
These oligonucleotides are cloned in frame with a gene 
VIII (gVIII) sequence so that translation of the sequence 
produces peptide fusion proteins. The second population
of oligonucleotides is cloned into a separate vector.
Each oligonucleotide within this population encodes the 
anti-sense of the amino terminal portion of the expressed 
oligonucleotides. This vector also contains the elements 
necessary for expression. The two vectors containing the 
random oligonucleotides are combined such that the two 
precursor oligonucleotide portions are joined together at 
random to form a population of larger oligonucleotides 
derived from two smaller portions. The vectors contain 
selectable markers to ensure maximum efficiency in 
joining together the two oligonucleotide populations. A 
mechanism also exists to control the expression of
gVIII-peptide fusion proteins during library construction
and screening.

As used herein, the term "monomer" or "nucleotide
monomer" refers to individual nucleotides used in the
chemical synthesis of oligonucleotides. Monomers that
can be used include both the ribo- and deoxyribo- forms
of each of the five standard nucleotides (derived from
the bases adenine (A or dA, respectively), guanine (G or
dG), cytosine (C or dC), thymine (T) and uracil (U)).
Derivatives and precursors of bases such as inosine which
are capable of supporting polypeptide biosynthesis are
also included as monomers. Also included are chemically
modified nucleotides, for example, one having a
reversible blocking agent attached to any of the
positions on the purine or pyrimidine bases, the ribose
or deoxyribose sugar or the phosphate or hydroxyl
moieties of the monomer. Such blocking groups include,
for example, dimethoxytrityl, benzoyl, isobutyryl,
beta-cyanoethyl and diisopropylamine groups, and are used
to protect hydroxyls, exocyclic amines and phosphate
moieties. Other blocking agents can also be used and are
known to one skilled in the art.



As used herein, the term "tuplet" refers to a group
of elements of a definable size. The elements of a
tuplet as used herein are nucleotide monomers. For
example, a tuplet can be a dinucleotide, a trinucleotide
or can also be four or more nucleotides.

As used herein, the term "codon" or "triplet" refers
to a tuplet consisting of three adjacent nucleotide
monomers which specify one of the twenty naturally
occurring amino acids found in polypeptide biosynthesis.
The term also includes nonsense, or stop, codons which do
not specify any amino acid.



"Random codons" or "randomized codons," as used
herein, refers to more than one codon at a position
within a collection of oligonucleotides. The number of
different codons can be from two to twenty at any
particular position. "Randomized oligonucleotides," as
used herein, refers to a collection of oligonucleotides
with random codons at one or more positions. "Random
codon sequences" as used herein means that more than one
codon position within a randomized oligonucleotide
contains random codons. For example, if randomized
oligonucleotides are six nucleotides in length (i.e., two
codons) and both the first and second codon positions are
randomized to encode all twenty amino acids, then a
population of oligonucleotides having random codon
sequences with every possible combination of the twenty
triplets in the first and second position makes up the
above population of randomized oligonucleotides. The
number of possible codon combinations is 20². Likewise,
if randomized oligonucleotides of fifteen nucleotides in
length are synthesized which have random codon sequences
at all positions encoding all twenty amino acids, then
all triplets coding for each of the twenty amino acids
will be found in equal proportions at every position.
The population constituting the randomized
oligonucleotides will contain 20¹⁵ different possible
species of oligonucleotides. "Random tuplets," or
"randomized tuplets" are defined analogously.

As used herein, the term "bias" refers to a
preference. It is understood that there can be degrees
of preference or bias toward codon sequences which encode
particular amino acids. For example, an oligonucleotide
whose codon sequences do not preferentially encode
particular amino acids is unbiased and therefore
completely random. The oligonucleotide codon sequences
can also be biased toward predetermined codon sequences
or codon frequencies and, while still diverse and random,
will exhibit codon sequences biased toward a defined, or
preferred, sequence. "A desirable bias of random codon
sequences," as used herein, refers to the predetermined
degree of bias which can be selected from totally random
to essentially, but not totally, defined (or preferred).
There must be at least one codon position which is
variable, however.

As used herein, the term "support" refers to a solid
phase material for attaching monomers for chemical
synthesis. Such support is usually composed of materials
such as beads of controlled pore glass but can be other
materials known to one skilled in the art. The term is
also meant to include one or more monomers coupled to the
support for additional oligonucleotide synthesis
reactions.

As used herein, the terms "coupling" and "condensing"
refer to the chemical reactions for attaching one
monomer to a second monomer or to a solid support. Such 
reactions are known to one skilled in the art and are 
typically performed on an automated DNA synthesizer such 
as a MilliGen/Biosearch Cyclone Plus Synthesizer using 
procedures recommended by the manufacturer. 
"Sequentially coupling" as used herein, refers to the 
stepwise addition of monomers. 

A method of synthesizing oligonucleotides having 
random tuplets using individual monomers is described. 
The method consists of several steps, the first being 
synthesis of a nucleotide tuplet for each tuplet to be 
randomized. As described here and below, a nucleotide 
triplet (i.e., a codon) will be used as a specific 
example of a tuplet. Any size tuplet will work using the
methods disclosed herein, and one skilled in the art
would know how to use the methods to randomize tuplets of
any size.

If the randomization of codons specifying all twenty
amino acids is desired at a position, then twenty
different codons are synthesized. Likewise, if
randomization of only ten codons at a particular position
is desired, then those ten codons are synthesized.
Randomization of from two to sixty-four codons can be
accomplished by synthesizing each desired triplet.
Preferably, randomization of from two to twenty codons is
used for any one position because of the redundancy of
the genetic code. The codons selected at one position do
not have to be the same codons selected at the next
position. Additionally, either the sense or the
anti-sense sequence of the oligonucleotide can be
synthesized. The process therefore provides for
randomization of any desired codon position with any
number of codons.

Codons to be randomized are synthesized sequentially
by coupling the first monomer of each codon to separate
supports. The supports for the synthesis of each codon
can, for example, be contained in different reaction
vessels such that one reaction vessel corresponds to the
monomer coupling reactions for one codon. As will be
used here and below, if twenty codons are to be
randomized, then twenty reaction vessels can be used in
independent coupling reactions for the first monomer of
each of the twenty codons. Synthesis proceeds by
sequentially coupling the second monomer of each codon to
the first monomer to produce a dimer, followed by
coupling the third monomer for each codon to each of the
above-synthesized dimers to produce a trimer (Figure 1,
step 1, where M₁, M₂ and M₃ represent the first, second
and third monomer, respectively, for each codon to be
randomized).

Following synthesis of the first codons from
individual monomers, the randomization is achieved by
mixing the supports from all twenty reaction vessels
which contain the individual codons to be randomized.
The solid phase support can be removed from its vessel
and mixed to achieve a random distribution of all codon
species within the population (Figure 1, step 2). The
mixed population of supports, constituting all codon
species, is then redistributed into twenty independent
reaction vessels (Figure 1, step 3). The resultant
vessels are all identical and contain equal portions of
all twenty codons coupled to a solid phase support.

For randomization of the second position codon,
synthesis of twenty additional codons is performed as in
step 1, one in each of the twenty reaction vessels
produced in step 3, whose supports serve as the
condensing substrates (Figure 1, step 4). Steps 1 and 4
are therefore equivalent except that step 4 uses the
supports produced by the previous synthesis cycle (steps
1 through 3) for codon synthesis whereas step 1 is the
initial synthesis of the first codon in the
oligonucleotide. The supports resulting from step 4 will
each have two codons attached to them (i.e., a
hexanucleotide) with the codon at the first position
being any one of twenty possible codons (i.e., random)
and the codon at the second position being one of the
twenty possible codons.

For randomization of the codon at the second
position and synthesis of the third position codon, steps
2 through 4 are repeated. This process yields in
each vessel a three codon oligonucleotide (i.e., 9
nucleotides) with codon positions 1 and 2 randomized and
position three containing one of the twenty possible
codons. Steps 2 through 4 are repeated to randomize the
third position codon and synthesize the codon at the next
position. The process is continued until an
oligonucleotide of the desired length is achieved. After
the final randomization step, the oligonucleotide can be
cleaved from the supports and isolated by methods known
to one skilled in the art. Alternatively, the
oligonucleotides can remain on the supports for use in
methods employing probe hybridization.
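
The cycle of coupling, mixing and dividing (steps 1 through 4 of Figure 1) can be sketched as a small simulation. The Python below is a minimal sketch under stated assumptions, not a description of any synthesizer: each bead is modeled as carrying a single sequence, the twenty codons listed are an arbitrary illustrative choice of one codon per amino acid, beads are divided as evenly as possible, and each string is written in the order of synthesis rather than in the 5' to 3' direction. The function name split_and_mix is hypothetical.

```python
import random
from collections import Counter

# Twenty illustrative codons, one per amino acid (an assumed choice).
CODONS = ["TTT", "GTT", "TCT", "CCT", "TAT", "CAT", "TGT", "CGT", "CTG", "ATG",
          "CAG", "GAG", "ACT", "GCT", "AAT", "GAT", "TGG", "GGG", "ATA", "AAA"]


def split_and_mix(codons, n_positions, n_beads, rng):
    """Simulate repeated mix / divide / couple cycles on a pool of beads."""
    beads = [""] * n_beads  # sequence carried by each bead, in order of synthesis
    for _ in range(n_positions):
        rng.shuffle(beads)  # step 2: mix the supports (a no-op for empty beads)
        # Step 3: divide the mixed supports evenly, one vessel per codon.
        vessels = [beads[i::len(codons)] for i in range(len(codons))]
        # Steps 1 and 4: couple one codon to every bead in its vessel.
        beads = [seq + codon
                 for codon, vessel in zip(codons, vessels)
                 for seq in vessel]
    return beads


beads = split_and_mix(CODONS, n_positions=3, n_beads=2000, rng=random.Random(0))
second_position = Counter(seq[3:6] for seq in beads)
print(len(set(beads)), "distinct sequences on", len(beads), "beads")
print("codons seen at position 2:", len(second_position))
```

Because the beads are divided evenly before every coupling, each codon appears on the same number of beads at every position, while the pairing of codons between positions is determined by the random mixing.
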

The diversity of codon sequences, i.e., the number
of different possible oligonucleotides, which can be
obtained using the methods of the present invention, is
extremely large and only limited by the physical
characteristics of available materials. For example, a
support composed of beads of about 100 µm in diameter
will be limited to about 10,000 beads/reaction vessel
using a 1 µM reaction vessel containing 25 mg of beads.
This size bead can support about 1 x 10⁷ oligonucleotides
per bead. Synthesis using separate reaction vessels for
each of the twenty amino acids will produce beads in
which all the oligonucleotides attached to an individual
bead are identical. The diversity which can be obtained
under these conditions is approximately 10⁷ copies of
10,000 x 20 or 200,000 different random oligonucleotides.
The diversity can be increased, however, in several ways
without departing from the basic methods disclosed
herein. For example, the number of possible sequences
can be increased by decreasing the size of the individual
beads which make up the support. A bead of about 30 µm
in diameter will increase the number of beads per
reaction vessel and therefore the number of
oligonucleotides synthesized. Another way to increase
the diversity of oligonucleotides with random codons is
to increase the volume of the reaction vessel. For
example, using the same size bead, a larger vessel can
contain a greater number of beads than a smaller vessel
and therefore support the synthesis of a greater number
of oligonucleotides. Increasing the number of codons
coupled to a support in a single reaction vessel also
increases the diversity of the random oligonucleotides.
The total diversity will be the number of codons coupled
per vessel raised to the number of codon positions
synthesized. For example, using ten reaction vessels,
each synthesizing two codons to randomize a total of
twenty codons, the number of different oligonucleotides
of ten codons in length per 100 µm bead can be increased
where each bead will contain about 2¹⁰ or 1 x 10³
different sequences instead of one. One skilled in the
art will know how to modify such parameters to increase
the diversity of oligonucleotides with random codons.
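
The arithmetic in the preceding paragraph can be restated as a short Python sketch. The two helper functions are hypothetical names introduced here for illustration; the numerical inputs are the approximate figures given in the text.

```python
def distinct_oligonucleotides(beads_per_vessel, n_vessels):
    """One sequence per bead: diversity is bounded by the total number of beads."""
    return beads_per_vessel * n_vessels


def sequences_per_bead(codons_per_vessel, n_positions):
    """Codons coupled per vessel raised to the number of codon positions."""
    return codons_per_vessel ** n_positions


# Twenty vessels of about 10,000 beads each, one codon coupled per vessel.
print(distinct_oligonucleotides(10_000, 20))  # 200000

# Ten vessels, two codons coupled per vessel, ten codon positions.
print(sequences_per_bead(2, 10))              # 1024
```

The second figure, 1,024, is the value the text rounds to about 1 x 10³ different sequences per bead.
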



A method of synthesizing oligonucleotides having
random codons at each position using individual monomers
wherein the number of reaction vessels is less than the
number of codons to be randomized is also described. For
example, if twenty codons are to be randomized at each
position within an oligonucleotide population, then ten
reaction vessels can be used. The use of a smaller
number of reaction vessels than the number of codons to
be randomized at each position is preferred because the
smaller number of reaction vessels is easier to
manipulate and results in a greater number of possible
oligonucleotides synthesized.

The use of a smaller number of reaction vessels for
random synthesis of twenty codons at a desired position
within an oligonucleotide is similar to that described
above using twenty reaction vessels except that each
reaction vessel can contain the synthesis products of
more than one codon. For example, step one synthesis
using ten reaction vessels proceeds by coupling about two
different codons on supports contained in each of ten
reaction vessels. This is shown in Figure 2 where each
of the two codons coupled to a different support can
consist of the following sequences: (1) (T/G)TT for Phe
and Val; (2) (T/C)CT for Ser and Pro; (3) (T/C)AT for Tyr
and His; (4) (T/C)GT for Cys and Arg; (5) (C/A)TG for Leu
and Met; (6) (C/G)AG for Gln and Glu; (7) (A/G)CT for Thr
and Ala; (8) (A/G)AT for Asn and Asp; (9) (T/G)GG for Trp
and Gly and (10) A(T/A)A for Ile and Lys. The slash (/)
signifies that a mixture of the monomers indicated on
each side of the slash is used as if it were a single
monomer in the indicated coupling step. The antisense
sequence for each of the above codons can be generated by
synthesizing the complementary sequence. For example,
the antisense for Phe and Val can be AA(C/A). The amino
acids encoded by each of the above pairs of sequences are
given in the standard three letter nomenclature.
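
The ten mixed-monomer sequences can be expanded and translated programmatically. The following Python sketch is illustrative only: it assumes the standard genetic code, reports amino acids in one-letter rather than three-letter code, and uses the hypothetical helper names expand and antisense.

```python
import re
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One mixed-monomer sequence per reaction vessel.
VESSELS = ["(T/G)TT", "(T/C)CT", "(T/C)AT", "(T/C)GT", "(C/A)TG",
           "(C/G)AG", "(A/G)CT", "(A/G)AT", "(T/G)GG", "A(T/A)A"]


def expand(spec):
    """Expand a sequence such as '(T/G)TT' into the codons it produces."""
    tokens = re.findall(r"\(([ACGT])/([ACGT])\)|([ACGT])", spec)
    choices = [(a, b) if a else (single,) for a, b, single in tokens]
    return ["".join(bases) for bases in product(*choices)]


def antisense(codon):
    """Reverse complement of a codon, written 5' to 3'."""
    return codon.translate(str.maketrans("ACGT", "TGCA"))[::-1]


encoded = {codon: CODON_TABLE[codon] for spec in VESSELS for codon in expand(spec)}
for spec in VESSELS:
    print(spec, [(codon, encoded[codon]) for codon in expand(spec)])
print([antisense(codon) for codon in expand("(T/G)TT")])  # ['AAA', 'AAC']
```

The two antisense codons printed on the last line, AAA and AAC, correspond to the mixed sequence AA(C/A) given in the text for Phe and Val.
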



Coupling of the monomers in this fashion will yield
codons specifying all twenty of the naturally occurring
amino acids attached to supports in ten reaction vessels.
However, the number of individual reaction vessels to be
used will depend on the number of codons to be randomized
at the desired position and can be determined by one
skilled in the art. For example, if ten codons are to be
randomized, then five reaction vessels can be used for
coupling. The codon sequences given above can be used
for this synthesis as well. The sequences of the codons
can also be changed to incorporate or be replaced by any
of the additional forty-four codons which constitute the
genetic code.

The remaining steps of synthesis of oligonucleotides
with random codons using a smaller number of reaction
vessels are as outlined above for synthesis with twenty
reaction vessels except that the mixing and dividing
steps are performed with supports from about half the
number of reaction vessels. These remaining steps are
shown in Figure 2 (steps 2 through 4).

Oligonucleotides having at least one specified
tuplet at a predetermined position and the remaining
positions having random tuplets can also be synthesized
using the methods described herein. The synthesis steps
are similar to those outlined above using twenty or fewer
reaction vessels except that prior to synthesis of the
specified codon position, the dividing of the supports
into separate reaction vessels for synthesis of different
codons is omitted. For example, if the codon at the
second position of the oligonucleotide is to be
specified, then following synthesis of random codons at
the first position and mixing of the supports, the mixed
supports are not divided into new reaction vessels but,
instead, can be contained in a single reaction vessel to
synthesize the specified codon. The specified codon is
synthesized sequentially from individual monomers as
described above. Thus, the number of reaction vessels
can be increased or decreased at each step to allow for
the synthesis of a specified codon or a desired number of
random codons.



Following codon synthesis, the mixed supports are
divided into individual reaction vessels for synthesis of
the next codon to be randomized (Figure 1, step 3) or can
be used without separation for synthesis of a consecutive
specified codon. The rounds of synthesis can be repeated
for each codon to be added until the desired number of
positions with predetermined or randomized codons is
obtained.



Oligonucleotides with the first position codon being
specified can also be synthesized using the above
method. In this case, the first position codon is
synthesized from the appropriate monomers. The
supports are divided into the required number of reaction
vessels needed for synthesis of random codons at the
second position and the rounds of synthesis, mixing and
dividing are performed as described above.

A method of synthesizing oligonucleotides having
tuplets which are diverse but biased toward a
predetermined sequence is also described herein. This
method employs two reaction vessels, one vessel for the
synthesis of a predetermined sequence and the second
vessel for the synthesis of a random sequence. This
method is advantageous to use when a significant number
of codon positions, for example, are to be of a specified
sequence since it avoids the use of multiple reaction
vessels. Instead, in the vessel for the random sequence,
a mixture of four different monomers such as adenine,
guanine, cytosine and thymine nucleotides is used for the
first and second monomers in the codon. The codon is
completed by coupling a mixture of a pair of monomers of
either guanine and thymine or cytosine and adenine
nucleotides at the third monomer position. In the other
vessel, nucleotide monomers are coupled sequentially to
yield the predetermined codon sequence. Mixing of the
two supports yields a population of oligonucleotides
containing both the predetermined codon and the random
codons at the desired position. Synthesis can proceed by
using this mixture of supports in a single reaction
vessel, for example, for coupling additional
predetermined codons or by further dividing the mixture
into two reaction vessels for synthesis of additional
random codons.

The two reaction vessel method can be used for codon
synthesis within an oligonucleotide with a predetermined
tuplet sequence by dividing the support mixture into two
portions at the desired codon position to be randomized.
Additionally, this method allows for the extent of
randomization to be adjusted. For example, unequal
mixing or dividing of the two supports will change the
fraction of codons with predetermined sequences compared
to those with random codons at the desired position.
Unequal mixing and dividing of supports can be useful
when there is a need to synthesize random codons at a
significant number of positions within an oligonucleotide
of a longer or shorter length.
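
The effect of unequal dividing on a single codon position can be estimated as follows. This Python sketch is a hedged illustration: it assumes the random vessel couples NN(G/T) with equimolar monomers, that the stated fraction of supports goes to the predetermined vessel, and that the standard genetic code applies. The function name position_distribution and the example codon TGG are hypothetical choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def position_distribution(predetermined_codon, fraction_predetermined):
    """Expected residue frequencies at one position after the supports are remixed."""
    random_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
    share = (1 - fraction_predetermined) / len(random_codons)
    dist = Counter()
    for codon in random_codons:            # supports from the random vessel
        dist[CODON_TABLE[codon]] += share
    # Supports from the vessel in which the predetermined codon was coupled.
    dist[CODON_TABLE[predetermined_codon]] += fraction_predetermined
    return dist


equal = position_distribution("TGG", Fraction(1, 2))    # supports divided equally
skewed = position_distribution("TGG", Fraction(3, 4))   # three quarters predetermined
print("Trp frequency, equal division:  ", equal["W"])
print("Trp frequency, unequal division:", skewed["W"])
```
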

The extent of randomization can also be adjusted by 
using unequal mixtures of monomers in the first, second 
and third monomer coupling steps of the random codon 
position. The unequal mixtures can be in any or all of 
the coupling steps to yield a population of codons 
enriched in sequences reflective of the monomer
proportions.
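
A simple way to picture the effect of unequal monomer mixtures is to treat each coupling step as an independent draw from the mixture. The Python sketch below rests on that assumption, and on the further assumption that each monomer is incorporated in proportion to its share of the mixture; the particular proportions and the name codon_probabilities are illustrative only.

```python
from fractions import Fraction
from itertools import product


def codon_probabilities(first, second, third):
    """Codon probability as the product of the per-step monomer proportions."""
    return {a + b + c: first[a] * second[b] * third[c]
            for a, b, c in product(first, second, third)}


equal = {base: Fraction(1, 4) for base in "ACGT"}
g_or_t = {"G": Fraction(1, 2), "T": Fraction(1, 2)}
# A hypothetical unequal mixture for the first coupling step, enriched in G.
g_rich = {"G": Fraction(1, 2), "A": Fraction(1, 6),
          "C": Fraction(1, 6), "T": Fraction(1, 6)}

unbiased = codon_probabilities(equal, equal, g_or_t)
enriched = codon_probabilities(g_rich, equal, g_or_t)
print("GGG with equal mixtures:     ", unbiased["GGG"])
print("GGG with G-enriched mixture: ", enriched["GGG"])
```
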

Synthesis of randomized oligonucleotides is 
performed using methods well known to one skilled in the 
art. Linear coupling of monomers can, for example, be 
accomplished using phosphoramidite chemistry with a 
MilliGen/Biosearch Cyclone Plus automated synthesizer as 
described by the manufacturer (Millipore, Burlington,
MA). Other chemistries and automated synthesizers can be
employed as well and are known to one skilled in the art. 

Synthesis of multiple codons can be performed 
without modification to the synthesizer by separately 
synthesizing the codons in individual sets of reactions. 
Alternatively, modification of an automated DNA 
synthesizer can be performed for the simultaneous 
synthesis of codons in multiple reaction vessels. 

In one embodiment, the invention provides a
plurality of procaryotic cells containing a diverse
population of expressible oligonucleotides operationally
linked to expression elements, the expressible
oligonucleotides having a desirable bias of random codon
sequences produced from diverse combinations of first and
second oligonucleotides having a desirable bias of random
sequences. The invention also provides a method for
constructing such a plurality of procaryotic cells.

The oligonucleotides synthesized by the above
methods can be used to express a plurality of random
peptides which are unbiased, diverse but biased toward a
predetermined sequence or which contain at least one
specified codon at a predetermined position. The need
will determine which type of oligonucleotide is to be
expressed to give the resultant population of random
peptides and is known to one skilled in the art.
Expression can be performed in any compatible vector/host
system. Such systems include, for example, plasmids or
phagemids in procaryotes such as E. coli, yeast systems,
and other eucaryotic systems such as mammalian cells, but
will be described herein in context with its presently
preferred embodiment, i.e. expression on the surface of
filamentous bacteriophage. Filamentous bacteriophage can
be, for example, M13, f1 and fd. Such phage have
circular single-stranded genomes and double strand
replicative DNA forms. Additionally, the peptides can
also be expressed in soluble or secreted form depending
on the need and the vector/host system employed.

Expression of random peptides on the surface of M13
can be accomplished, for example, using the vector system
shown in Figure 3. Construction of the vectors is
explicitly set out in Examples I and II in sufficient
detail to enable one of ordinary skill to make them. The
complete nucleotide sequences are given in Figures 5, 6
and 7 (SEQ ID NOS: 1, 2 and 3, respectively). This system
produces random oligonucleotides functionally linked to
expression elements and to gVIII by combining two smaller
oligonucleotide portions contained in separate vectors
into a single vector. The diversity of oligonucleotide
species obtained by this system or others described
herein can be 5 x 10^7 or greater. Diversity of less than
5 x 10^7 can also be obtained and will be determined by
the need and type of random peptides to be expressed.
The random combination of two precursor portions into a
larger oligonucleotide increases the diversity of the
population several fold and has the added advantage of
producing oligonucleotides larger than what can be
synthesized by standard methods. Additionally, although
the correlation is not known, when the number of possible
paths an oligonucleotide can take during synthesis such
as described herein is greater than the number of beads,
then there will be a correlation between the synthesis
path and the sequences obtained. By combining
oligonucleotide populations which are synthesized
separately, this correlation will be destroyed.
Therefore, any bias which may be inherent in the
synthesis procedures will be alleviated by joining two
precursor portions into a contiguous random
oligonucleotide.

Populations of precursor oligonucleotides to be
combined into an expressible form are each cloned into
separate vectors. The two precursor portions which make
up the combined oligonucleotide correspond to the
carboxy and amino terminal portions of the expressed
peptide. Each precursor oligonucleotide can encode
either the sense or anti-sense strand; the choice will
depend on the orientation of the expression elements and
the gene encoding the fusion portion of the protein as
well as the mechanism used to join the two precursor
oligonucleotides. For the vectors shown in Figure 3,
precursor oligonucleotides corresponding to the carboxy
terminal portion of the peptide encode the sense strand.
Those corresponding to the amino terminal portion encode
the anti-sense strand. Oligonucleotide populations are
inserted between the Eco RI and Sac I restriction enzyme
sites in M13IX22 and M13IX42 (Figure 3A and B). M13IX42
(SEQ ID NO: 1) is the vector used for sense strand
precursor oligonucleotide portions and M13IX22 (SEQ ID
NO: 2) is used for anti-sense precursor portions.

The populations of randomized oligonucleotides
inserted into the vectors are synthesized with Eco RI and
Sac I recognition sequences flanking opposite ends of the
random codon sequences. The sites allow annealing and
ligation of these single strand oligonucleotides into a
double stranded vector restricted with Eco RI and Sac I.
Alternatively, the oligonucleotides can be inserted into
the vector by standard mutagenesis methods. In this
latter method, single stranded vector DNA is isolated
from the phage and annealed with random oligonucleotides
having known sequences complementary to vector sequences.
The oligonucleotides are extended with DNA polymerase to
produce double stranded vectors containing the randomized
oligonucleotides.

The vector used for sense strand oligonucleotide
portions, M13IX42 (Figure 3B), contains downstream and in
frame with the Eco RI and Sac I restriction sites a
sequence encoding the pseudo-wild type gVIII product.
This gene encodes the wild type M13 gVIII amino acid
sequence but has been changed at the nucleotide level to
reduce homologous recombination with the wild type gVIII
contained on the same vector. The wild type gVIII is
present to ensure that at least some functional,
non-fusion coat protein will be produced. The inclusion
of a wild type gVIII therefore reduces the possibility of
non-viable phage production and biological selection
against certain peptide fusion proteins. Differential
regulation of the two genes can also be used to control
the relative ratio of the pseudo and wild type proteins.

Also contained downstream and in frame with the Eco
RI and Sac I restriction sites is an amber stop codon.
The mutation is located six codons downstream from Sac I
and therefore lies between the inserted oligonucleotides
and the gVIII sequence. Like the wild type gVIII, the
amber stop codon also reduces biological selection when
combining precursor portions to produce expressible
oligonucleotides. This is accomplished by using a
non-suppressor (sup O) host strain because non-suppressor
strains will terminate expression after the
oligonucleotide sequences but before the pseudo gVIII
sequences. Therefore, the pseudo gVIII will never be
expressed on the phage surface under these circumstances.
Instead, only soluble peptides will be produced.
Expression in a non-suppressor strain can be
advantageously utilized when one wishes to produce large
populations of soluble peptides. Stop codons other than
amber, such as opal and ochre, or molecular switches,
such as inducible repressor elements, can also be used to
unlink peptide expression from surface expression.
Additional controls exist as well and are described
below.
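
The role of the amber stop codon as a switch between soluble and surface expression can be illustrated with a small translation sketch in Python. It is an illustration only: the two coding segments are short hypothetical stand-ins rather than sequences from the vectors, and the residue inserted at the amber codon in a suppressor strain is assumed here to be glutamine, as in supE hosts, which the text above does not specify.

```python
# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(orf, amber_suppressor):
    """Translate an open reading frame up to the first effective stop codon.

    In a suppressor host the amber codon (TAG) is read through; the
    inserted residue is assumed to be Gln (Q), as in supE strains.
    """
    peptide = []
    for pos in range(0, len(orf) - 2, 3):
        codon = orf[pos:pos + 3]
        if codon == "TAG" and amber_suppressor:
            peptide.append("Q")  # assumption: glutamine-inserting suppressor
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical stand-ins for the inserted codons and the pseudo gVIII.
INSERT_PART = "ATGGCTTTT"
PSEUDO_GVIII_PART = "GCTGAGGGT"
ORF = INSERT_PART + "TAG" + PSEUDO_GVIII_PART + "TAA"

soluble = translate(ORF, amber_suppressor=False)  # stops at the amber codon
fusion = translate(ORF, amber_suppressor=True)    # reads through into gVIII
print(soluble, fusion)
```

In the non-suppressor case translation ends before the stand-in gVIII segment, while in the suppressor case the same reading frame continues through it.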

The vector used for anti-sense strand
oligonucleotide portions, M13IX22 (Figure 3A), contains
the expression elements for the peptide fusion proteins.
Upstream and in frame with the Sac I and Eco RI sites in
this vector is a leader sequence for surface expression.
A ribosome binding site and Lac Z promoter/operator
elements are present for transcription and translation of
the peptide fusion proteins.

Both vectors contain a pair of Fok I restriction
enzyme sites (Figure 3A and B) for joining together two
precursor oligonucleotide portions and their vector
sequences. One site is located at the end of each
precursor oligonucleotide which is to be joined. The
second Fok I site within the vectors is located at the
end of the vector sequences which are to be joined. The
5' overhang of this second Fok I site has been altered to
encode a sequence which is not found in the overhangs
produced at the first Fok I site within the
oligonucleotide portions. The two sites allow the
cleavage of each circular vector into two portions and
subsequent ligation of essential components within each
vector into a single circular vector where the two
oligonucleotide precursor portions form a contiguous
sequence (Figure 3C). Non-compatible overhangs produced
at the two Fok I sites allow optimal conditions to be
selected for performing concatemerization or
circularization reactions for joining the two vector
portions. Such selection of conditions can be used to
govern the reaction order and therefore increase the
efficiency of joining.

Fok I is a restriction enzyme whose recognition
sequence is distal to the point of cleavage. Distal
placement of the recognition sequence relative to the
cleavage point is important since if the two were
superimposed within the oligonucleotide portions to be
combined, it would lead to an invariant codon sequence at
the juncture. To alleviate the formation of invariant
codons at the juncture, Fok I recognition sequences can
be placed outside of the random codon sequence and still
be used to restrict within the random sequence.
Subsequent annealing of the single-strand overhangs
produced by Fok I and ligation of the two oligonucleotide
precursor portions allows the juncture to be formed. A
variety of restriction enzymes restrict DNA by this
mechanism and can be used instead of Fok I to join
precursor oligonucleotides without creating invariant
codon sequences. Such enzymes include, for example, Alw
I, Bbv I, Bsp MI, Hga I, Hph I, Mbo II, Mnl I, Ple I and
Sfa NI. One skilled in the art knows how to substitute
Fok I recognition sequences for alternative enzyme
recognition sequences such as those above, and use the
appropriate enzyme for joining precursor oligonucleotide
portions.
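
The distal cleavage described above can be sketched in Python. This is a minimal illustration, not a description of the vectors: it assumes the commonly cited Fok I specificity (recognition sequence GGATG, with cuts 9 bases downstream on the recognition strand and 13 bases downstream on the complementary strand), considers only a site written on the given strand, and uses a hypothetical test sequence.

```python
FOK1_SITE = "GGATG"
TOP_CUT_OFFSET = 9      # bases after the site on the recognition strand
BOTTOM_CUT_OFFSET = 13  # bases after the site on the complementary strand


def fok1_overhang(top_strand):
    """Locate the first Fok I site and return (top_cut, bottom_cut, overhang).

    The cut positions are indices into top_strand; overhang is the four
    bases of top_strand lying between the two staggered cuts, which are
    left single-stranded after cleavage.
    """
    start = top_strand.find(FOK1_SITE)
    if start < 0:
        raise ValueError("no Fok I recognition sequence found")
    site_end = start + len(FOK1_SITE)
    top_cut = site_end + TOP_CUT_OFFSET
    bottom_cut = site_end + BOTTOM_CUT_OFFSET
    if bottom_cut > len(top_strand):
        raise ValueError("sequence too short downstream of the site")
    return top_cut, bottom_cut, top_strand[top_cut:bottom_cut]


# Hypothetical sequence: site, nine spacer bases, then the overhang bases.
sequence = "CC" + "GGATG" + "ACGTACGTA" + "TTGC" + "AAAA"
top_cut, bottom_cut, overhang = fok1_overhang(sequence)
print(top_cut, bottom_cut, overhang)
```

Because the overhang lies entirely outside the recognition sequence, its four bases can be any sequence, which is why the juncture need not contain an invariant codon.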



Although the sequences of the precursor
oligonucleotides are random and will invariably have
oligonucleotides within the two precursor populations
whose sequences are sufficiently complementary to anneal
after cleavage, the efficiency of annealing can be
increased by ensuring that the single-strand overhangs
within one precursor population will have a complementary
sequence within the second precursor population. This
can be accomplished by synthesizing a non-degenerate
series of known sequences at the Fok I cleavage site
coding for each of the twenty amino acids. Since the Fok
I cleavage site contains a four base overhang, forty
different sequences are needed to randomly encode all
twenty amino acids. For example, if two precursor
populations of ten codons in length are to be combined,
then after the ninth codon position is synthesized, the
mixed population of supports is divided into forty
reaction vessels for each of the populations and
complementary sequences for each of the corresponding
reaction vessels between populations are independently
synthesized. The sequences are shown in Tables III and
VI of Example I where the oligonucleotides on columns 1R
through 40R form complementary overhangs with the
oligonucleotides on the corresponding columns 1L through
40L once cleaved. The degenerate X positions in Table VI
are necessary to maintain the reading frame once the
precursor oligonucleotide portions are joined. However,
restriction enzymes which produce a blunt end, such as
Mnl I, can alternatively be used in place of Fok I to
alleviate the degeneracy introduced in maintaining the
reading frame.

The last feature exhibited by each of the vectors is
an amber stop codon located in an essential coding
sequence within the vector portion lost during combining
(Figure 3C). The amber stop codon is present to select
for viable phage produced from only the proper
combination of precursor oligonucleotides and their
vector sequences into a single vector species. Other
non-sense mutations or selectable markers can work as
well.

The combining step randomly brings together 
different precursor oligonucleotides within the two 
populations into a single vector (Figure 3C; M13IX). The
vector sequences donated from each independent vector, 
M13IX22 and M13IX42, are necessary for production of 
viable phage. Also, since the expression elements are 
contained in M13IX22 and the gVIII sequences are 
contained in M13IX42, expression of functional gVIII- 
peptide fusion proteins cannot be accomplished until the 
sequences are linked as shown in M13IX. 

The combining step is performed by restricting each
population of vectors containing randomized
oligonucleotides with Fok I, mixing and ligating (Figure
3C). Any vectors generated which contain an amber stop
codon will not produce viable phage when introduced into
a non-suppressor strain (Figure 3D). Therefore, only the
sequences which do not contain an amber stop codon will
make up the final population of vectors contained in the
library. These vector sequences are the sequences
required for surface expression of randomized peptides.
By analogous methodology, more than two vector portions
can be combined into a single vector which expresses
random peptides.

The invention provides for a method of selecting
peptides capable of being bound by a ligand binding
protein from a population of random peptides by (a)
operationally linking a diverse population of first
oligonucleotides having a desirable bias of random codon
sequences to a first vector; (b) operationally linking a
diverse population of second oligonucleotides having a
desirable bias of random codon sequences to a second
vector; (c) combining the vector products of steps (a)
and (b) under conditions where said populations of first
and second oligonucleotides are joined together into a
population of combined vectors; (d) introducing said
population of combined vectors into a compatible host
under conditions sufficient for expressing said
population of random peptides; and (e) determining the
peptides which bind to said binding protein. The
invention also provides for determining the encoding
nucleic acid sequence of such peptides.

Surface expression of the random peptide library is
performed in an amber suppressor strain. As described
above, the amber stop codon between the random codon
sequence and the gVIII sequence unlinks the two
components in a non-suppressor strain. Isolating the
phage produced from the non-suppressor strain and
infecting a suppressor strain will link the random codon
sequences to the gVIII sequence during expression (Figure
3E). Culturing the suppressor strain after infection
allows the expression of all peptide species within the
library as gVIII-peptide fusion proteins. Alternatively,
the DNA can be isolated from the non-suppressor strain
and then introduced into a suppressor strain to
accomplish the same effect.

The level of expression of gVIII-peptide fusion
proteins can additionally be controlled at the
transcriptional level. The gVIII-peptide fusion proteins
are under the inducible control of the Lac Z
promoter/operator system. Other inducible promoters can
work as well and are known by one skilled in the art.
For high levels of surface expression, the suppressor
library is cultured in an inducer of the Lac Z promoter
such as isopropylthio-β-galactoside (IPTG). Inducible
control is beneficial because biological selection
against non-functional gVIII-peptide fusion proteins can
be minimized by culturing the library under
non-expressing conditions. Expression can then be induced
only at the time of screening to ensure that the entire
population of oligonucleotides within the library is
accurately represented on the phage surface. Inducible
control can also be used to control the valency of the
peptide on the phage surface.

The surface expression library is screened for
specific peptides which bind ligand binding proteins by
standard affinity isolation procedures. Such methods
include, for example, panning, affinity chromatography
and solid phase blotting procedures. Panning as
described by Parmley and Smith, Gene 73:305-318 (1988),
which is incorporated herein by reference, is preferred
because high titers of phage can be screened easily,
quickly and in small volumes. Furthermore, this
procedure can select minor peptide species within the
population, which otherwise would have been undetectable,
and amplify them to substantially homogeneous populations.
The selected peptide sequences can be determined by
sequencing the nucleic acid encoding such peptides after
amplification of the phage population.

The invention provides a plurality of procaryotic 
cells containing a diverse population of oligonucleotides 
having a desirable bias of random codon sequences that 
are operationally linked to expression sequences. The 
invention provides for methods of constructing such 
populations of cells as well. 

Random oligonucleotides synthesized by any of the 
methods described previously can also be expressed on the 
surface of filamentous bacteriophage, such as M13, for 
example, without the joining together of precursor 
oligonucleotides. A vector such as that shown in Figure 
4, M13IX30, can be used. This vector exhibits all the 
functional features of the combined vector shown in 
Figure 3C for surface expression of gVIII-peptide fusion
proteins. The complete nucleotide sequence for M13IX30
(SEQ ID NO: 3) is shown in Figure 7.

M13IX30 contains a wild type gVIII for phage
viability and a pseudo gVIII sequence for peptide
fusions. The vector also contains in frame restriction
sites for cloning random peptides. The cloning sites in
this vector are Xho I, Stu I and Spe I. Oligonucleotides
should therefore be synthesized with the appropriate
complementary ends for annealing and ligation or
insertional mutagenesis. Alternatively, the appropriate
termini can be generated by PCR technology. Between the
restriction sites and the pseudo gVIII sequence is an
in-frame amber stop codon, again ensuring complete
viability of phage in constructing and manipulating the
library. Expression and screening are performed as
described above for the surface expression library of
oligonucleotides generated from precursor portions.

Thus, the invention provides a method of selecting
peptides capable of being bound by a ligand binding
protein from a population of random peptides by (a)
operationally linking a diverse population of
oligonucleotides having a desirable bias of random codon
sequences to expression elements; (b) introducing said
population of vectors into a compatible host under
conditions sufficient for expressing said population of
random peptides; and (c) determining the peptides which
bind to said binding protein. Also provided is a method
for determining the encoding nucleic acid sequence of
such selected peptides.

The following examples are intended to illustrate, 
but not limit the invention. 



EXAMPLE I

Isolation and Characterization of Peptide Ligands
Generated from Right and Left Half Random Oligonucleotides

This example shows the synthesis of random
oligonucleotides and the construction and expression of
surface expression libraries of the encoded randomized
peptides. The random peptides of this example derive
from the mixing and joining together of two random
oligonucleotides. Also demonstrated is the isolation and
characterization of peptide ligands and their
corresponding nucleotide sequence for specific binding
proteins.

Synthesis of Random Oligonucleotides

The synthesis of two randomized oligonucleotides
which correspond to smaller portions of a larger
randomized oligonucleotide is shown below. Each of the
two smaller portions makes up one-half of the larger
oligonucleotide. The populations of randomized
oligonucleotides constituting each half are designated
the right and left half. The oligonucleotides in each
of the right and left half populations are ten codons in
length with twenty random codons at each position. The
right half corresponds to the sense sequence of the
randomized oligonucleotides and encodes the carboxy
terminal half of the expressed peptides. The left half
corresponds to the anti-sense sequence of the randomized
oligonucleotides and encodes the amino terminal half of
the expressed peptides. The right and left halves of the
randomized oligonucleotide populations are cloned into
separate vector species and then mixed and joined so that
the right and left halves come together in random
combination to produce a single expression vector species
which contains a population of randomized
oligonucleotides twenty codons in length.
Electroporation of the vector population into an
appropriate host produces filamentous phage which express
the random peptides on their surface.



The reaction vessels for oligonucleotide synthesis
were obtained from the manufacturer of the automated
synthesizer (Millipore, Burlington, MA; supplier of
MilliGen/Biosearch Cyclone Plus Synthesizer). The
vessels were supplied as packages containing empty
reaction columns (1 µmole), frits, crimps and plugs
(MilliGen/Biosearch catalog # GEN 860458). Derivatized
and underivatized controlled pore glass, phosphoramidite
nucleotides, and synthesis reagents were also obtained
from MilliGen/Biosearch. Crimper and decrimper tools
were obtained from Fisher Scientific Co., Pittsburgh, PA
(Catalog numbers 06-406-20 and 06-406-25A, respectively).

Ten reaction columns were used for right half
synthesis of random oligonucleotides ten codons in
length. The oligonucleotides have 5 monomers at their 3'
end of the sequence 5'GAGCT3' and 8 monomers at their 5'
end of the sequence 5'AATTCCAT3'. The synthesizer was
fitted with a column derivatized with a thymine
nucleotide (T-column, MilliGen/Biosearch # 0615.50) and
was programmed to synthesize the sequences shown in Table
I for each of ten columns in independent reaction sets.
The sequence of the last three monomers (from right to
left since synthesis proceeds 3' to 5') encodes the
indicated amino acids:



Table I

Column        Sequence (5' to 3')    Amino Acids

column 1R     (T/G)TTGAGCT           Phe and Val
column 2R     (T/C)CTGAGCT           Ser and Pro
column 3R     (T/C)ATGAGCT           Tyr and His
column 4R     (T/C)GTGAGCT           Cys and Arg
column 5R     (C/A)TGGAGCT           Leu and Met
column 6R     (C/G)AGGAGCT           Gln and Glu
column 7R     (A/G)CTGAGCT           Thr and Ala
column 8R     (A/G)ATGAGCT           Asn and Asp
column 9R     (T/G)GGGAGCT           Trp and Gly
column 10R    A(T/A)AGAGCT           Ile and Lys

where the two monomers in parentheses denote a single
monomer position within the codon and indicate that an
equal mixture of each monomer was added to the reaction
for coupling. The monomer coupling reactions for each
of the 10 columns were performed as recommended by the
manufacturer (amidite version SI. 06, # 8400-050990, scale
1 µM). After the last coupling reaction, the columns
were washed with acetonitrile and lyophilized to dryness.
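
The coverage of the ten degenerate codons can be checked with a short Python sketch. It assumes the standard genetic code and takes the codon portion of each column (the sequence with the 3' GAGCT tail removed); the helper names are hypothetical and introduced only for this illustration.

```python
import re
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Codon portion of each right half column, written 5' to 3'.
COLUMN_CODONS = {
    "1R": "(T/G)TT", "2R": "(T/C)CT", "3R": "(T/C)AT", "4R": "(T/C)GT",
    "5R": "(C/A)TG", "6R": "(C/G)AG", "7R": "(A/G)CT", "8R": "(A/G)AT",
    "9R": "(T/G)GG", "10R": "A(T/A)A",
}


def expand(pattern):
    """Expand a pattern such as '(T/G)TT' into its individual codons."""
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])", pattern)
    choices = [mixed.split("/") if mixed else [single] for mixed, single in tokens]
    return ["".join(combo) for combo in product(*choices)]


encoded = {
    column: [CODON_TABLE[codon] for codon in expand(pattern)]
    for column, pattern in COLUMN_CODONS.items()
}
all_residues = sorted(residue for residues in encoded.values() for residue in residues)
print(encoded)
print(len(set(all_residues)), "distinct amino acids, stop codons:", "*" in all_residues)
```

Under the standard code the ten columns yield twenty codons for twenty different amino acids and no stop codons; for example, A(T/A)A expands to ATA (Ile) and AAA (Lys).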

Following synthesis, the plugs were removed from
each column using a decrimper and the reaction products
were poured into a single weigh boat. Initially the bead
mass increases due to the weight of the monomers;
however, at later rounds of synthesis material is lost.
In either case, the material was equalized with
underivatized controlled pore glass and mixed thoroughly
to obtain a random distribution of all twenty codon
species. The reaction products were then aliquotted into
10 new reaction columns by removing 25 mg of material at
a time and placing it into separate reaction columns.
Alternatively, the reaction products can be aliquotted by
suspending the beads in a liquid that is dense enough for
the beads to remain dispersed, preferably a liquid that
is equal in density to the beads, and then aliquotting
equal volumes of the suspension into separate reaction
columns. The lip on the inside of the columns where the
frits rest was cleared of material using vacuum suction
with a syringe and 25 G needle. New frits were placed
onto the lips, the plugs were fitted into the columns and
were crimped into place using a crimper.

Synthesis of the second codon position was achieved
using the above 10 columns containing the random mixture
of reaction products from the first codon synthesis. The
monomer coupling reactions for the second codon position
are shown in Table II. An A in the first (3') position
means that any monomer can be programmed into the
synthesizer. At that position, the first monomer position
is not coupled by the synthesizer since the software
assumes that the monomer is already attached to the
column. An A also denotes that the columns from the
previous codon synthesis should be placed on the
synthesizer for use in the present synthesis round.
Reactions were again sequentially repeated for each
column as shown in Table II and the reaction products
washed and dried as described above.



Table II

Column        Sequence (5' to 3')    Amino Acids

column 1R     (T/G)TTA               Phe and Val
column 2R     (T/C)CTA               Ser and Pro
column 3R     (T/C)ATA               Tyr and His
column 4R     (T/C)GTA               Cys and Arg
column 5R     (C/A)TGA               Leu and Met
column 6R     (C/G)AGA               Gln and Glu
column 7R     (A/G)CTA               Thr and Ala
column 8R     (A/G)ATA               Asn and Asp
column 9R     (T/G)GGA               Trp and Gly
column 10R    A(T/A)AA               Ile and Lys



Randomization of the second codon position was achieved
by removing the reaction products from each of the
columns and thoroughly mixing the material. The material
was again divided into new reaction columns and prepared
for monomer coupling reactions as described above.
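
The repeated cycle of coupling, mixing and dividing can be sketched as a small simulation in Python. This is a simplified illustration under stated assumptions: each simulated "bead" stands for a single growing chain (a real bead carries many chains), every column receives an equal share of the mixed material, and the two monomers of an equal mixture are incorporated with equal probability. The codon pairs are the expanded codons of the ten columns described above.

```python
import random

# The two codons produced in each of the ten columns (5' to 3').
CODON_PAIRS = [
    ("TTT", "GTT"), ("TCT", "CCT"), ("TAT", "CAT"), ("TGT", "CGT"),
    ("CTG", "ATG"), ("CAG", "GAG"), ("ACT", "GCT"), ("AAT", "GAT"),
    ("TGG", "GGG"), ("ATA", "AAA"),
]


def split_and_mix(n_beads, n_positions, rng):
    """Simulate n_positions rounds of mix, divide and codon coupling."""
    beads = [""] * n_beads
    for _ in range(n_positions):
        rng.shuffle(beads)  # thorough mixing of the pooled supports
        for index in range(n_beads):
            column = index % len(CODON_PAIRS)  # equal division into columns
            codon = rng.choice(CODON_PAIRS[column])  # equal monomer mixture
            # Synthesis proceeds 3' to 5', so each new codon is added at
            # the 5' end of the chain already on the support.
            beads[index] = codon + beads[index]
    return beads


beads = split_and_mix(n_beads=1000, n_positions=3, rng=random.Random(0))
print(beads[:5])
```

Because the supports are remixed before every division, the column a chain visits in one round is independent of the column it visited in the previous round, which is what makes each codon position random.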



Random synthesis of the next seven codons (positions
3 through 9) proceeded identically to the cycle described
above for the second codon position and again used the
monomer sequences of Table II. Each of the newly
repacked columns containing the random mixture of
reaction products from synthesis of the previous codon
position was used for the synthesis of the subsequent
codon position. After synthesis of the codon at position
nine and mixing of the reaction products, the material
was divided and repacked into 40 different columns and
the monomer sequences shown in Table III were coupled to
each of the 40 columns in independent reactions. The
oligonucleotides from each of the 40 columns were mixed
once more and cleaved from the controlled pore glass as
recommended by the manufacturer.

Table III

Column        Sequence (5' to 3')

column 1R     AATTCTTTTA
column 2R     AATTCTGTTA
column 3R     AATTCGTTTA
column 4R     AATTCGGTTA
column 5R     AATTCTTCTA
column 6R     AATTCTCCTA
column 7R     AATTCGTCTA
column 8R     AATTCGCCTA
column 9R     AATTCTTATA
column 10R    AATTCTCATA
column 11R    AATTCGTATA
column 12R    AATTCGCATA
column 13R    AATTCTTGTA
column 14R    AATTCTCGTA
column 15R    AATTCGTGTA
column 16R    AATTCGCGTA
column 17R    AATTCTCTGA
column 18R    AATTCTATGA
column 19R    AATTCGCTGA
column 20R    AATTCGATGA
column 21R    AATTCTCAGA
column 22R    AATTCTGAGA
column 23R    AATTCGCAGA
column 24R    AATTCGGAGA
column 25R    AATTCTACTA
column 26R    AATTCTGCTA
column 27R    AATTCGACTA
column 28R    AATTCGGCTA
column 29R    AATTCTAATA
column 30R    AATTCTGATA
column 31R    AATTCGAATA
column 32R    AATTCGGATA
column 33R    AATTCTTGGA
column 34R    AATTCTGGGA
column 35R    AATTCGTGGA
column 36R    AATTCGGGGA
column 37R    AATTCTATAA
column 38R    AATTCTAAAA
column 39R    AATTCGATAA
column 40R    AATTCGAAAA



Left half synthesis of random oligonucleotides
proceeded similarly to the right half synthesis. This
half of the oligonucleotide corresponds to the anti-sense
sequence of the encoded randomized peptides. Thus, the
complementary sequence of the codons in Tables I through
III is synthesized. The left half oligonucleotides also
have 5 monomers at their 3' end of the sequence 5'GAGCT3'
and 8 monomers at their 5' end of the sequence
5'AATTCCAT3'. The rounds of synthesis, washing, drying,
mixing, and dividing are as described above.
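
The relationship between the right half (sense) codons and the left half (anti-sense) codons can be sketched in Python as a reverse complement that preserves the mixed monomer positions. The function name is hypothetical and the sketch handles only the simple notation used in the tables: single bases and two-way mixtures written in parentheses.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement_degenerate(pattern):
    """Reverse complement a pattern such as '(T/G)TT', keeping mixtures."""
    tokens = re.findall(r"\([ACGT](?:/[ACGT])+\)|[ACGT]", pattern)
    converted = []
    for token in reversed(tokens):
        if token.startswith("("):
            alternatives = token[1:-1].split("/")
            converted.append(
                "(" + "/".join(COMPLEMENT[base] for base in alternatives) + ")"
            )
        else:
            converted.append(COMPLEMENT[token])
    return "".join(converted)


# Sense codons of the ten right half columns, written 5' to 3'.
RIGHT_HALF = ["(T/G)TT", "(T/C)CT", "(T/C)AT", "(T/C)GT", "(C/A)TG",
              "(C/G)AG", "(A/G)CT", "(A/G)AT", "(T/G)GG", "A(T/A)A"]
left_half = [reverse_complement_degenerate(codon) for codon in RIGHT_HALF]
print(left_half)
```

For example, the sense codon (T/G)TT for Phe and Val corresponds to the anti-sense sequence AA(A/C), and A(T/A)A corresponds to T(A/T)T.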

For the first codon position, the synthesizer was
fitted with a T-column and programmed to synthesize the
sequences shown in Table IV for each of ten columns in
independent reaction sets. As with right half synthesis,
the sequence of the last three monomers (from right to
left) encodes the indicated amino acids:

Table IV

Column        Sequence (5' to 3')    Amino Acids

column 1L     AA(A/C)GAGCT           Phe and Val
column 2L     AG(A/G)GAGCT           Ser and Pro
column 3L     AT(A/G)GAGCT           Tyr and His
column 4L     AC(A/G)GAGCT           Cys and Arg
column 5L     CA(G/T)GAGCT           Leu and Met
column 6L     CT(G/C)GAGCT           Gln and Glu
column 7L     AG(T/C)GAGCT           Thr and Ala
column 8L     AT(T/C)GAGCT           Asn and Asp
column 9L     CC(A/C)GAGCT           Trp and Gly
column 10L    T(A/T)TGAGCT           Ile and Lys



Following washing and drying, the plugs were removed from
each column and the reaction products were mixed and
aliquotted into ten new reaction columns as described
above. Synthesis of the second codon position was
achieved using these ten columns containing the random
mixture of reaction products from the first codon
synthesis. The monomer coupling reactions for the second
codon position are shown in Table V.



Table V

Column        Sequence (5' to 3')    Amino Acids

column 1L     AA(A/C)A               Phe and Val
column 2L     AG(A/G)A               Ser and Pro
column 3L     AT(A/G)A               Tyr and His
column 4L     AC(A/G)A               Cys and Arg
column 5L     CA(G/T)A               Leu and Met
column 6L     CT(G/C)A               Gln and Glu
column 7L     AG(T/C)A               Thr and Ala
column 8L     AT(T/C)A               Asn and Asp
column 9L     CC(A/C)A               Trp and Gly
column 10L    T(A/T)TA               Ile and Lys



Again, randomization of the second codon position was
achieved by removing the reaction products from each of
the columns and thoroughly mixing the beads. The beads
were repacked into ten new reaction columns.
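
The divide, couple, and mix cycle described above can be illustrated with a small simulation. This is a toy model under stated assumptions, not the synthesizer protocol: beads are represented as strings, only the three codon monomers of each column entry are modeled (flanking monomers are ignored), each column couples one of its two triplets at random, and the names `ANTICODONS`, `couple` and `split_and_pool` are hypothetical. Because the left half is the anti-sense strand and synthesis proceeds 3' to 5', each new triplet is added to the 5' end of the growing sequence.

```python
import random

# The two anti-sense triplets coupled in each of the ten columns.
ANTICODONS = [("AAA", "AAC"), ("AGA", "AGG"), ("ATA", "ATG"), ("ACA", "ACG"),
              ("CAG", "CAT"), ("CTG", "CTC"), ("AGT", "AGC"), ("ATT", "ATC"),
              ("CCA", "CCC"), ("TAT", "TTT")]


def couple(beads, pair, rng):
    """Add one of the column's two triplets to the 5' end of every bead."""
    return [rng.choice(pair) + bead for bead in beads]


def split_and_pool(n_beads, n_rounds, rng):
    """Run n_rounds of mix, divide into ten columns, and couple."""
    beads = [""] * n_beads
    for _ in range(n_rounds):
        rng.shuffle(beads)                                   # mix
        portions = [beads[i::len(ANTICODONS)]                # divide
                    for i in range(len(ANTICODONS))]
        beads = [bead
                 for pair, portion in zip(ANTICODONS, portions)
                 for bead in couple(portion, pair, rng)]     # couple
    return beads


rng = random.Random(0)
library = split_and_pool(n_beads=10_000, n_rounds=9, rng=rng)
print(len(library), "beads,", len(set(library)), "distinct sequences")
print("possible nine-codon sequences:", 20 ** 9)
```

With twenty triplets available at each of nine positions, the number of possible sequences (20^9) far exceeds the number of beads in this toy run, so almost every simulated bead carries a different sequence.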

Random synthesis of the next seven codon positions
proceeded identically to the cycle described above for
the second codon position and again used the monomer
sequences of Table V. After synthesis of the codon at
position nine and mixing of the reaction products, the
material was divided and repacked into 40 different
columns and the monomer sequences shown in Table VI were
coupled to each of the 40 columns in independent
reactions.

Table VI

Column        Sequence (5' to 3')

column 1L     AATTCCATAAAAXXA
column 2L     AATTCCATAAACXXA
column 3L     AATTCCATAACAXXA
column 4L     AATTCCATAACCXXA
column 5L     AATTCCATAGAAXXA
column 6L     AATTCCATAGACXXA
column 7L     AATTCCATAGGAXXA
column 8L     AATTCCATAGGCXXA
column 9L     AATTCCATATAAXXA
column 10L    AATTCCATATACXXA
column 11L    AATTCCATATGAXXA
column 12L    AATTCCATATGCXXA
column 13L    AATTCCATACAAXXA
column 14L    AATTCCATACACXXA
column 15L    AATTCCATACGAXXA
column 16L    AATTCCATACGCXXA
column 17L    AATTCCATCAGAXXA
column 18L    AATTCCATCAGCXXA
column 19L    AATTCCATCATAXXA
column 20L    AATTCCATCATCXXA
column 21L    AATTCCATCTGAXXA
column 22L    AATTCCATCTGCXXA
column 23L    AATTCCATCTCAXXA
column 24L    AATTCCATCTCCXXA
column 25L    AATTCCATAGTAXXA
column 26L    AATTCCATAGTCXXA
column 27L    AATTCCATAGCAXXA
column 28L    AATTCCATAGCCXXA
column 29L    AATTCCATATTAXXA
column 30L    AATTCCATATTCXXA
column 31L    AATTCCATATCAXXA
column 32L    AATTCCATATCCXXA
column 33L    AATTCCATCCAAXXA
column 34L    AATTCCATCCACXXA
column 35L    AATTCCATCCCAXXA
column 36L    AATTCCATCCCCXXA
column 37L    AATTCCATTATAXXA
column 38L    AATTCCATTATCXXA
column 39L    AATTCCATTTTAXXA
column 40L    AATTCCATTTTCXXA



The first two monomers denoted by an "X" represent an
equal mixture of all four nucleotides at that position.
This is necessary to retain a relatively unbiased codon
sequence at the junction between right and left half
oligonucleotides. The above right and left half random
oligonucleotides were cleaved and purified from the
supports and used in constructing the surface expression
libraries below.
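
To make the role of the two "X" positions concrete, the sketch below expands one Table VI entry into the individual sequences it denotes. It is illustrative only; the helper name `expand_x` is hypothetical and the equal-mixture assumption is modeled simply by enumerating all four nucleotides at each "X".

```python
from itertools import product


def expand_x(seq, alphabet="ACGT"):
    """Return every sequence obtained by replacing each X with A, C, G or T."""
    slots = [alphabet if base == "X" else base for base in seq]
    return ["".join(p) for p in product(*slots)]


variants = expand_x("AATTCCATAAAAXXA")   # column 1L of Table VI
print(len(variants))                      # 4 * 4 combinations
print(variants[0], variants[-1])
```

Each of the forty columns therefore contributes sixteen sequence variants at the junction, all sharing the 5' AATTCCAT monomers.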

Vector Construction

Two M13-based vectors, M13IX42 (SEQ ID NO: 1) and
M13IX22 (SEQ ID NO: 2), were constructed for the cloning
and propagation of right and left half populations of
random oligonucleotides, respectively. The vectors were
specially constructed to facilitate the random joining
and subsequent expression of right and left half
oligonucleotide populations. Each vector within the
population contains one right and one left half
oligonucleotide from the population joined together to
form a single contiguous oligonucleotide with random
codons which is twenty-two codons in length. The
resultant population of vectors is used to construct a
surface expression library.

M13IX42, or the right-half vector, was constructed
to harbor the right half populations of randomized
oligonucleotides. M13mp18 (Pharmacia, Piscataway, NJ)
was the starting vector. This vector was genetically
modified to contain, in addition to the encoded wild type
M13 gene VIII already present in the vector: (1) a
pseudo-wild type M13 gene VIII sequence with a stop codon
(amber) placed between it and an Eco RI-Sac I cloning
site for randomized oligonucleotides; (2) a pair of Fok I
sites to be used for joining with M13IX22, the left-half
vector; (3) a second amber stop codon placed on the
opposite side of the vector from the portion being
combined with the left-half vector; and (4) various other
mutations to remove redundant restriction sites and the
amino terminal portion of Lac Z.

The pseudo-wild type M13 gene VIII was used for
surface expression of random peptides. The pseudo-wild
type gene encodes the identical amino acid sequence as
that of the wild type gene; however, the nucleotide
sequence has been altered so that only 63% identity
exists between this gene and the encoded wild type gene
VIII. Modification of the gene VIII nucleotide sequence
used for surface expression reduces the possibility of
homologous recombination with the wild type gene VIII
contained on the same vector. Additionally, the wild
type M13 gene VIII was retained in the vector system to
ensure that at least some functional, non-fusion, coat
protein would be produced. The inclusion of wild type
gene VIII therefore reduces the possibility of non-viable
phage production from the random peptide fusion genes.
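
The idea of lowering nucleotide identity while keeping the encoded protein unchanged can be shown with a short calculation. The sketch below is a hedged illustration only: the two sequences are hypothetical four-codon examples (both encoding Ala-Glu-Gly-Asp), not the gene VIII sequences of this disclosure, and `percent_identity` is an assumed helper name.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same nucleotide."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy example: synonymous codons differ at the third position.
wild_type = "GCTGAGGGTGAC"     # Ala Glu Gly Asp
pseudo_wild = "GCAGAAGGCGAT"   # Ala Glu Gly Asp, alternative codons
print(round(percent_identity(wild_type, pseudo_wild), 1))
```

Changing only the third position of every codon leaves two of every three nucleotides identical; reaching a lower figure such as 63% requires additional synonymous changes where the genetic code allows them (for example at the first position of Leu, Arg or Ser codons).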

The pseudo-wild type gene VIII was constructed by
chemically synthesizing a series of oligonucleotides
which encode both strands of the gene. The
oligonucleotides are presented in Table VII (SEQ ID NOS:
7 through 16).

TABLE VII

Pseudo-Wild Type Gene VIII Oligonucleotide Series

Top Strand
Oligonucleotides    Sequence (5' to 3')

VIII 03             GATCC TAG GCT GAA GGC GAT
                    GAC CCT GCT AAG GCT GC
VIII 04             A TTC AAT AGT TTA CAG GCA
                    AGT GCT ACT GAG TAC A
VIII 05             TT GGC TAC GCT TGG GCT ATG
                    GTA GTA GTT ATA GTT
VIII 06             GGT GCT ACC ATA GGG ATT AAA
                    TTA TTC AAA AAG TT
VIII 07             T ACG AGC AAG GCT TCT TA

Bottom Strand
Oligonucleotides    Sequence (5' to 3')

VIII 08             AGC TTA AGA AGC CTT GCT CGT
                    AAA CTT TTT GAA TAA TTT
VIII 09             AAT CCC TAT GGT AGC ACC AAC
                    TAT AAC TAC TAC CAT
VIII 10             AGC CCA AGC GTA GCC AAT GTA
                    CTC AGT AGC ACT TG
VIII 11             C CTG TAA ACT ATT GAA TGC
                    AGC CTT AGC AGG GTC
VIII 12             ATC GCC TTC AGC CTA G





Except for the terminal oligonucleotides VIII 03
(SEQ ID NO: 7) and VIII 08 (SEQ ID NO: 12), the above
oligonucleotides (oligonucleotides VIII 04-VIII 07 and
09-12 (SEQ ID NOS: 8 through 11 and 13 through 16)) were
mixed at 200 ng each in 10 μl final volume and
phosphorylated with T4 polynucleotide kinase (Pharmacia,
Piscataway, NJ) with 1 mM ATP at 37°C for 1 hour. The
reaction was stopped at 65°C for 5 minutes. Terminal
oligonucleotides were added to the mixture and annealed
into double-stranded form by heating to 65°C for 5
minutes, followed by cooling to room temperature over a
period of 30 minutes. The annealed oligonucleotides were
ligated together with 1.0 U of T4 DNA ligase (BRL). The
annealed and ligated oligonucleotides yield a double-
stranded DNA flanked by a Bam HI site at its 5' end and
by a Hind III site at its 3' end. A translational stop
codon (amber) immediately follows the Bam HI site. The
gene VIII sequence begins with the codon GAA (Glu) two
codons 3' to the stop codon. The double-stranded insert
was phosphorylated using T4 DNA Kinase (Pharmacia,
Piscataway, NJ) and ATP (10 mM Tris-HCl, pH 7.5, 10 mM
MgCl2) and cloned in frame with the Eco RI and Sac I
sites within the M13 polylinker. To do so, M13mp18 was
digested with Bam HI (New England Biolabs, Beverly, MA)
and Hind III (New England Biolabs) and combined at a
molar ratio of 1:10 with the double-stranded insert. The
ligations were performed at 16°C overnight in 1X ligase
buffer (50 mM Tris-HCl, pH 7.8, 10 mM MgCl2, 20 mM DTT, 1
mM ATP, 50 ng/ml BSA) containing 1.0 U of T4 DNA ligase
(New England Biolabs). The ligation mixture was
transformed into a host and screened for positive clones
using standard procedures in the art.
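
The annealing scheme can be sanity-checked computationally. The following sketch, offered as an illustration rather than as part of the procedure, joins the top-strand oligonucleotides (VIII 03 through VIII 07) and the bottom-strand oligonucleotides (VIII 08 through VIII 12) as transcribed from Table VII with the spacing removed, and tests whether the two strands are complementary apart from a four-nucleotide 5' overhang at each end. The variable and function names are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Oligonucleotides from Table VII, written 5' to 3'.
TOP = [
    "GATCCTAGGCTGAAGGCGATGACCCTGCTAAGGCTGC",      # VIII 03
    "ATTCAATAGTTTACAGGCAAGTGCTACTGAGTACA",        # VIII 04
    "TTGGCTACGCTTGGGCTATGGTAGTAGTTATAGTT",        # VIII 05
    "GGTGCTACCATAGGGATTAAATTATTCAAAAAGTT",        # VIII 06
    "TACGAGCAAGGCTTCTTA",                         # VIII 07
]
BOTTOM = [
    "AGCTTAAGAAGCCTTGCTCGTAAACTTTTTGAATAATTT",    # VIII 08
    "AATCCCTATGGTAGCACCAACTATAACTACTACCAT",       # VIII 09
    "AGCCCAAGCGTAGCCAATGTACTCAGTAGCACTTG",        # VIII 10
    "CCTGTAAACTATTGAATGCAGCCTTAGCAGGGTC",         # VIII 11
    "ATCGCCTTCAGCCTAG",                           # VIII 12
]

top = "".join(TOP)
bottom = "".join(BOTTOM)

top_overhang, bottom_overhang = top[:4], bottom[:4]
duplex_ok = top[4:] == reverse_complement(bottom[4:])
print(top_overhang, bottom_overhang, duplex_ok)   # GATC AGCT True
print(top[5:8], top[11:14])                       # TAG GAA
```

The GATC and AGCT overhangs are the single-stranded ends compatible with Bam HI and Hind III digested vector, respectively, and the amber codon (TAG) and the first gene VIII codon (GAA) appear at the positions described in the text.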

Several mutations were generated within the right-
half vector to yield functional M13IX42. The mutations
were generated using the method of Kunkel et al., Meth.
Enzymol. 154:367-382 (1987), which is incorporated herein
by reference, for site-directed mutagenesis. The
reagents, strains and protocols were obtained from a Bio
Rad Mutagenesis kit (Bio Rad, Richmond, CA) and
mutagenesis was performed as recommended by the
manufacturer.

A Fok I site used for joining the right and left
halves was generated 8 nucleotides 5' to the unique Eco
RI site using the oligonucleotide
5'-CTCGAATTCGTACATCCTGGTCATAGC-3' (SEQ ID NO: 17). The
second Fok I site retained in the vector is naturally
encoded at position 3547; however, the sequence within
the overhang was changed to encode CTTC. Two Fok I sites
were removed from the vector at positions 239 and 7244 of
M13mp18 as well as the Hind III site at the end of the
pseudo gene VIII sequence using the mutant
oligonucleotides 5'-CATTTTTGCAGATGGCTTAGA-3' (SEQ ID NO:
18) and 5'-TAGCATTAACGTCCAATA-3' (SEQ ID NO: 19),
respectively. New Hind III and Mlu I sites were also
introduced at position 3919 and 3951 of M13IX42. The
oligonucleotides used for this mutagenesis had the
sequences 5'-ATATATTTTAGTAAGCTTCATCTTCT-3' (SEQ ID NO:
20) and 5'-GACAAAGAACGCGTGAAAACTTT-3' (SEQ ID NO: 21),
respectively.

The amino terminal portion of Lac Z was deleted by
oligonucleotide-directed mutagenesis using the mutant
oligonucleotide 5'-GCGGGCCTCTTCGCTATTGCTTAAGAAGCCTTGCT-3'
(SEQ ID NO: 22). This deletion also removed a third
M13mp18 derived Fok I site. The distance between the Eco
RI and Sac I sites was increased to ensure complete
double digestion by inserting a spacer sequence. The
spacer sequence was inserted using the oligonucleotide
5'-TTCAGCCTAGGATCCGCCGAGCTCTCCTACCTGCGAATTCGTACATCC-3'
(SEQ ID NO: 23). Finally, an amber stop codon was placed
at position 4492 using the mutant oligonucleotide
5'-TGGATTATACTTCTAAATAATGGA-3' (SEQ ID NO: 24). The amber
stop codon is used as a biological selection to ensure
the proper recombination of vector sequences to bring
together right and left halves of the randomized
oligonucleotides. In constructing the above mutations,
all changes made in a M13 coding region were performed
such that the amino acid sequence remained unaltered. It
should be noted that several mutations within M13mp18
were found which differed from the published sequence.
Where known, these sequence differences are recorded
herein as found and therefore may not correspond exactly
to the published sequence of M13mp18.

The sequence of the resultant vector, M13IX42, is
shown in Figure 5 (SEQ ID NO: 1). Figure 3A also shows
M13IX42 where each of the elements necessary for
producing a surface expression library between right and
left half randomized oligonucleotides is marked. The
sequence between the two Fok I sites shown by the arrow
is the portion of M13IX42 which is to be combined with a
portion of the left-half vector to produce random
oligonucleotides as fusion proteins of gene VIII.



M13IX22, or the left-half vector, was constructed to
harbor the left half populations of randomized
oligonucleotides. This vector was constructed from
M13mp19 (Pharmacia, Piscataway, NJ) and contains: (1)
two Fok I sites for mixing with M13IX42 to bring together
the left and right halves of the randomized
oligonucleotides; (2) sequences necessary for expression
such as a promoter and signal sequence and translation
initiation signals; (3) an Eco RI-Sac I cloning site for
the randomized oligonucleotides; and (4) an amber stop
codon for biological selection in bringing together right
and left half oligonucleotides.

Of the two Fok I sites used for mixing M13IX22 with
M13IX42, one is naturally encoded in M13mp18 and M13mp19
(at position 3547). As with M13IX42, the overhang
within this naturally occurring Fok I site was changed to
CTTC. The other Fok I site was introduced after
construction of the translation initiation signals by
site-directed mutagenesis using the oligonucleotide
5'-TAACACTCATTCCGGATGGAATTCTGGAGTCTGGGT-3' (SEQ ID NO: 25).
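
Fok I is a type IIS restriction enzyme: it recognizes the sequence GGATG and cuts at a fixed distance outside that sequence, which is why the overhang can be altered independently of the recognition site. The sketch below simply locates Fok I recognition sequences in two of the mutagenic oligonucleotides given in this section. It is an illustrative helper under stated assumptions; the function name `find_fok_i_sites` is hypothetical, and strand labels are relative to each oligonucleotide as written.

```python
FOK_I_SITE = "GGATG"                      # Fok I recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_fok_i_sites(seq):
    """Return sorted (position, strand) pairs for Fok I sites in seq.

    Positions are 0-based offsets in seq. '+' means GGATG is read on the
    strand as written; '-' means the site lies on the complementary
    strand and appears in seq as CATCC.
    """
    hits = []
    motifs = (("+", FOK_I_SITE), ("-", reverse_complement(FOK_I_SITE)))
    for strand, motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)


seq_id_17 = "CTCGAATTCGTACATCCTGGTCATAGC"
seq_id_25 = "TAACACTCATTCCGGATGGAATTCTGGAGTCTGGGT"
print(find_fok_i_sites(seq_id_17))   # [(12, '-')]
print(find_fok_i_sites(seq_id_25))   # [(13, '+')]
```

In both oligonucleotides the recognition sequence lies next to an Eco RI site (GAATTC), in keeping with the placement of the introduced Fok I sites near the cloning site.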






The translation initiation signals were constructed
by annealing of overlapping oligonucleotides as described
above to produce a double-stranded insert containing a 5'
Eco RI site and a 3' Hind III site. The overlapping
oligonucleotides are shown in Table VIII (SEQ ID NOS: 26
through 34) and were ligated as a double-stranded insert
between the Eco RI and Hind III sites of M13mp18 as
described for the pseudo gene VIII insert. The ribosome
binding site (AGGAGAC) is located in oligonucleotide 015
(SEQ ID NO: 26) and the translation initiation codon
(ATG) is the first three nucleotides of oligonucleotide
016 (SEQ ID NO: 27).

TABLE VIII

Oligonucleotide Series for Construction of
Translation Signals in M13IX22

Oligonucleotide Sequence (5' to 3')

AATT C GCC AAG GAG ACA GTC AT
AATG AAA TAC CTA TTG CCT ACG
    GCA GCC GCT GGA TTG TT
ATTA CTC GCT GCC CAA CCA GCC
    ATG GCC GAG CTC GTG AT
GACC CAG ACT CCA GATATC CAA CAG
    GAA TGA GTG TTA AT
TCT AGA ACG CGT C
ACGT G ACG CGT TCT AGA AT TAA
    CACTCA TTC CTG T
TG GAT ATC TGG AGT CTG GGT CAT
    CAC GAG CTC GGC CAT G
GC TGG TTG GGC AGC GAG TAA TAA
    CAA TCC AGC GGC TGC C
GT AGG CAA TAG GTA TTT CAT TAT
    GAC TGT CCT TGG CG

Oligonucleotide 017 (SEQ ID NO: 27) contained a Sac I
restriction site 67 nucleotides downstream from the ATG
codon. The naturally occurring Eco RI site was removed
and a new site introduced 25 nucleotides downstream from
the Sac I. Oligonucleotides
5'-TGACTGTCTCCTTGGCGTGTGAAATTGTTA-3' (SEQ ID NO: 35) and
5'-TAACACTCATTCCGGATGGAATTCTGGAGTCTGGGT-3' (SEQ ID NO: 36)
were used to generate each of the mutations,
respectively. An amber stop codon was also introduced at
position 3263 of M13mp18 using the oligonucleotide
5'-CAATTTTATCCTAAATCTTACCAAC-3' (SEQ ID NO: 37).

In addition to the above mutations, a variety of
other modifications were made to remove certain sequences
and redundant restriction sites. The Lac Z ribosome
binding site was removed when the original Eco RI site in
M13mp18 was mutated. Also, the Fok I sites at positions
239, 6361 and 7244 of M13mp18 were likewise removed with
mutant oligonucleotides 5'-CATTTTTGCAGATGGCTTAGA-3' (SEQ
ID NO: 38), 5'-CGAAAGGGGGGTGTGCTGCAA-3' (SEQ ID NO: 39)
and 5'-TAGCATTAACGTCCAATA-3' (SEQ ID NO: 40),
respectively. Again, mutations within the coding region
did not alter the amino acid sequence.



The resultant vector, M13IX22, is 7320 base pairs in
length, the sequence of which is shown in Figure 6 (SEQ
ID NO: 2). The Sac I and Eco RI cloning sites are at
positions 6290 and 6314, respectively. Figure 3A also
shows M13IX22 where each of the elements necessary for
producing a surface expression library between right and
left half randomized oligonucleotides is marked.

Library Construction

Each population of right and left half randomized
oligonucleotides from columns 1R through 40R and columns
1L through 40L is cloned separately into M13IX42 and
M13IX22, respectively, to create sublibraries of right
and left half randomized oligonucleotides. Therefore, a
total of eighty sublibraries are generated. Each
population of randomized oligonucleotides is maintained
separately until the final screening step to ensure
maximum efficiency of annealing of right and left half
oligonucleotides. The greater efficiency increases the
total number of randomized oligonucleotides which can be
obtained. Alternatively, one can combine all forty
populations of right half oligonucleotides (columns
1R-40R) into one population and of left half
oligonucleotides (columns 1L-40L) into a second
population to generate just one sublibrary for each.

For the generation of sublibraries, each of the
above populations of randomized oligonucleotides is
cloned separately into the appropriate vector. The right
half oligonucleotides are cloned into M13IX42 to generate
sublibraries M13IX42.1R through M13IX42.40R. The left
half oligonucleotides are similarly cloned into M13IX22
to generate sublibraries M13IX22.1L through M13IX22.40L.
Each vector contains unique Eco RI and Sac I restriction
enzyme sites which produce 5' and 3' single-stranded
overhangs, respectively, when digested. The single
strand overhangs are used for the annealing and ligation
of the complementary single-stranded random
oligonucleotides.

The randomized oligonucleotide populations are
cloned between the Eco RI and Sac I sites by sequential
digestion and ligation steps. Each vector is treated
with an excess of Eco RI (New England Biolabs) at 37°C
for 2 hours followed by addition of 4-24 units of calf
intestinal alkaline phosphatase (Boehringer Mannheim,
Indianapolis, IN). Reactions are stopped by
phenol/chloroform extraction and ethanol precipitation.
The pellets are resuspended in an appropriate amount of
distilled or deionized water (dH2O). About 10 pmol of
vector is mixed with a 5000-fold molar excess of each
population of randomized oligonucleotides in 10 μl of 1X
ligase buffer (50 mM Tris-HCl, pH 7.8, 10 mM MgCl2, 20 mM
DTT, 1 mM ATP, 50 μg/ml BSA) containing 1.0 U of T4 DNA
ligase (BRL, Gaithersburg, MD). The ligation is
incubated at 16°C for 16 hours. Reactions are stopped by
heating at 75°C for 15 minutes and the DNA is digested
with an excess of Sac I (New England Biolabs) for 2
hours. Sac I is inactivated by heating at 75°C for 15
minutes and the volume of the reaction mixture is
adjusted to 300 μl with an appropriate amount of 10X
ligase buffer and dH2O. One unit of T4 DNA ligase (BRL)
is added and the mixture is incubated overnight at 16°C.
The DNA is ethanol precipitated and resuspended in TE (10
mM Tris-HCl, pH 8.0, 1 mM EDTA). DNA from each ligation
is electroporated into XL1 Blue™ cells (Stratagene, La
Jolla, CA), as described below, to generate the
sublibraries.

E. coli XL1 Blue™ is electroporated as described by
Smith et al., Focus 12:38-40 (1990), which is incorporated
herein by reference. The cells are prepared by
inoculating a fresh colony of XL1s into 5 mls of SOB
without magnesium (20 g bacto-tryptone, 5 g bacto-yeast
extract, 0.584 g NaCl, 0.186 g KCl, dH2O to 1,000 mls)
and grown with vigorous aeration overnight at 37°C. SOB
without magnesium (500 ml) is inoculated at 1:1000 with
the overnight culture and grown with vigorous aeration at
37°C until the OD550 is 0.8 (about 2 to 3 h). The cells
are harvested by centrifugation at 5,000 rpm (2,600 x g)
in a GS3 rotor (Sorvall, Newtown, CT) at 4°C for 10
minutes, resuspended in 500 ml of ice-cold 10% (v/v)
sterile glycerol and centrifuged and resuspended a second
time in the same manner. After a third centrifugation,
the cells are resuspended in 10% sterile glycerol at a
final volume of about 2 ml, such that the OD550 of the
suspension is 200 to 300. Usually, resuspension is
achieved in the 10% glycerol that remains in the bottle
after pouring off the supernate. Cells are frozen in 40
μl aliquots in microcentrifuge tubes using a dry ice-
ethanol bath and stored frozen at -70°C.

Frozen cells are electroporated by thawing slowly on
ice before use and mixing with about 10 pg to 500 ng of
vector per 40 μl of cell suspension. A 40 μl aliquot is
placed in a 0.1 cm electroporation chamber (Bio-Rad,
Richmond, CA) and pulsed once at 0°C using a 200 Ω
parallel resistor, 25 μF, 1.88 kV, which gives a pulse
length (τ) of ~4 ms. A 10 μl aliquot of the pulsed cells
is diluted into 1 ml SOC (98 mls SOB plus 1 ml of 2 M
MgCl2 and 1 ml of 2 M glucose) in a 12- x 75-mm culture
tube, and the culture is shaken at 37°C for 1 hour prior
to culturing in selective media (see below).
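
The pulse settings above can be related to one another with simple circuit arithmetic. The sketch below is an approximation under an explicit assumption, namely that the pulse is an exponential capacitor discharge whose time constant is the product of capacitance and total resistance, with the sample acting as a second resistance in parallel with the 200 Ω resistor. The function names are hypothetical.

```python
def time_constant_ms(resistance_ohm, capacitance_uf):
    """RC time constant in milliseconds."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Nominal field strength across the electroporation chamber."""
    return voltage_kv / gap_cm


def implied_sample_resistance(parallel_ohm, capacitance_uf, observed_ms):
    """Sample resistance that, in parallel with the resistor, gives observed_ms."""
    total_ohm = observed_ms * 1e-3 / (capacitance_uf * 1e-6)
    return 1.0 / (1.0 / total_ohm - 1.0 / parallel_ohm)


print(round(time_constant_ms(200, 25), 3))               # limit set by the resistor alone
print(round(field_strength_kv_per_cm(1.88, 0.1), 3))     # kV/cm
print(round(implied_sample_resistance(200, 25, 4.0), 1)) # ohms, under the RC assumption
```

Under this model the resistor alone would give a 5 ms time constant, so the stated pulse length of about 4 ms is consistent with a sample resistance of several hundred ohms, as expected for cells washed into low-conductivity 10% glycerol.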

Each of the eighty sublibraries is cultured using
methods known to one skilled in the art. Such methods
can be found in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold
Spring Harbor, 1989, and in Ausubel et al., Current
Protocols in Molecular Biology, John Wiley and Sons, New
York, 1989, both of which are incorporated herein by
reference. Briefly, the above 1 ml sublibrary cultures
were grown up by diluting 50-fold into 2XYT media (16 g
tryptone, 10 g yeast extract, 5 g NaCl) and culturing at
37°C for 5-8 hours. The bacteria were pelleted by
centrifugation at 10,000 x g. The supernatant containing
phage was transferred to a sterile tube and stored at
4°C.

Double strand vector DNA containing right and left
half randomized oligonucleotide inserts is isolated from
the cell pellet of each sublibrary. Briefly, the pellet
is washed in TE (10 mM Tris, pH 8.0, 1 mM EDTA) and
recollected by centrifugation at 7,000 rpm for 5 minutes
in a Sorvall centrifuge (Newtown, CT). Pellets are
resuspended in 6 mls of 10% sucrose, 50 mM Tris, pH 8.0.
3.0 ml of 10 mg/μl lysozyme is added and incubated on ice
for 20 minutes. 12 mls of 0.2 M NaOH, 1% SDS is added
followed by 10 minutes on ice. The suspensions are then
incubated on ice for 20 minutes after addition of 7.5 mls
of 3 M NaOAc, pH 4.6. The samples are centrifuged at
15,000 rpm for 15 minutes at 4°C, RNased and extracted
with phenol/chloroform, followed by ethanol
precipitation. The pellets are resuspended, weighed and
an equal weight of CsCl is dissolved into each tube until
a density of 1.60 g/ml is achieved. EtBr is added to 600
μg/ml and the double-stranded DNA is isolated by
equilibrium centrifugation in a TV-1665 rotor (Sorvall)
at 50,000 rpm for 6 hours. These DNAs from each right
and left half sublibrary are used to generate forty
libraries in which the right and left halves of the
randomized oligonucleotides have been randomly joined
together.

Each of the forty libraries is produced by joining
together one right half and one left half sublibrary.
The two sublibraries joined together correspond to the
same column number for right and left half random
oligonucleotide synthesis. For example, sublibrary
M13IX42.1R is joined with M13IX22.1L to produce the
surface expression library M13IX.1RL. In the alternative
situation where only two sublibraries are generated from
the combined populations of all right half synthesis and
all left half synthesis, only one surface expression
library would be produced.

For the random joining of each right and left half
oligonucleotide population into a single surface
expression vector species, the DNAs isolated from each
sublibrary are digested with an excess of Fok I (New
England Biolabs). The reactions are stopped by
phenol/chloroform extraction, followed by ethanol
precipitation. Pellets are resuspended in dH2O. Each
surface expression library is generated by ligating equal
molar amounts (5-10 pmol) of Fok I digested DNA isolated
from corresponding right and left half sublibraries in 10
μl of 1X ligase buffer containing 1.0 U of T4 DNA ligase
(Bethesda Research Laboratories, Gaithersburg, MD). The
ligations proceed overnight at 16°C and are
electroporated into the sup O strain MK30-3 (Boehringer
Mannheim Biochemical (BMB), Indianapolis, IN) as
previously described for XL1 cells. Because MK30-3 is
sup O, only the vector portions encoding the randomized
oligonucleotides which come together will produce viable
phage.

Screening of Surface Expression Libraries

Purified phage are prepared from 50 ml liquid
cultures of XL1 Blue™ cells (Stratagene) which are
infected at a m.o.i. of 10 from the phage stocks stored
at 4°C. The cultures are induced with 2 mM IPTG.
Supernatants from all cultures are combined and cleared
by two centrifugations, and the phage are precipitated by
adding 1/7.5 volumes of PEG solution (25% PEG-8000, 2.5 M
NaCl), followed by incubation at 4°C overnight. The
precipitate is recovered by centrifugation for 90 minutes
at 10,000 x g. Phage pellets are resuspended in 25 ml of
0.01 M Tris-HCl, pH 7.6, 1.0 mM EDTA, and 0.1% Sarkosyl
and then shaken slowly at room temperature for 30
minutes. The solutions are adjusted to 0.5 M NaCl and to
a final concentration of 5% polyethylene glycol. After 2
hours at 4°C, the precipitates containing the phage are
recovered by centrifugation for 1 hour at 15,000 x g.
The precipitates are resuspended in 10 ml of NET buffer
(0.1 M NaCl, 1.0 mM EDTA, and 0.01 M Tris-HCl, pH 7.6),
mixed well, and the phage repelleted by centrifugation at
170,000 x g for 3 hours. The phage pellets are
subsequently resuspended overnight in 2 ml of NET buffer
and subjected to cesium chloride centrifugation for 18
hours at 110,000 x g (3.86 g of cesium chloride in 10 ml
of buffer). Phage bands are collected, diluted 7-fold
with NET buffer, recentrifuged at 170,000 x g for 3
hours, resuspended, and stored at 4°C in 0.3 ml of NET
buffer containing 0.1 mM sodium azide.
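
The first precipitation step adds a fixed fraction of PEG solution to the cleared supernatant, so the working concentrations follow from a simple dilution. The sketch below is a worked calculation only; the function name is hypothetical, and it ignores any salt already present in the culture medium.

```python
def final_concentration(stock, added_volume_fraction):
    """Concentration after adding `added_volume_fraction` volumes of stock
    to one volume of sample (same units as `stock`)."""
    return stock * added_volume_fraction / (1.0 + added_volume_fraction)


fraction = 1 / 7.5
supernatant_ml = 50.0

print(round(supernatant_ml * fraction, 2))             # ml of PEG solution to add
print(round(final_concentration(25.0, fraction), 2))   # % PEG-8000 after mixing
print(round(final_concentration(2.5, fraction), 3))    # M NaCl contributed by the PEG solution
```

Adding 1/7.5 volumes of the 25% PEG-8000, 2.5 M NaCl solution therefore gives roughly 3% PEG and about 0.3 M added NaCl, lower than the 5% polyethylene glycol and 0.5 M NaCl used in the second precipitation.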



Ligand binding proteins used for panning on
streptavidin coated dishes are first biotinylated and
then absorbed against UV-inactivated blocking phage (see
below). The biotinylating reagents are dissolved in
dimethylformamide at a ratio of 2.4 mg solid NHS-SS-
Biotin (sulfosuccinimidyl 2-(biotinamido)ethyl-1,3'-
dithiopropionate; Pierce, Rockford, IL) to 1 ml solvent
and used as recommended by the manufacturer. Small-scale
reactions are accomplished by mixing 1 μl dissolved
reagent with 43 μl of 1 mg/ml ligand binding protein
diluted in sterile bicarbonate buffer (0.1 M NaHCO3, pH
8.6). After 2 hours at 25°C, residual biotinylating
reagent is reacted with 500 μl 1 M ethanolamine (pH
adjusted to 9 with HCl) for an additional 2 hours. The
entire sample is diluted with 1 ml TBS containing 1 mg/ml
BSA, concentrated to about 50 μl on a Centricon 30 ultra-
filter (Amicon), and washed on the same filter three
times with 2 ml TBS and once with 1 ml TBS containing
0.02% NaN3 and 7 x 10 UV-inactivated blocking phage (see
below); the final retentate (60-80 μl) is stored at 4°C.
Ligand binding proteins biotinylated with the NHS-SS-
Biotin reagent are linked to biotin via a disulfide-
containing chain.

UV-irradiated M13 phage were used for blocking
binding proteins which fortuitously bound filamentous
phage in general. M13mp8 (Messing and Vieira, Gene 19:
262-276 (1982), which is incorporated herein by
reference) was chosen because it carries two amber stop
codons, which ensure that the few phage surviving
irradiation will not grow in the sup O strains used to
titer the surface expression libraries. A 5 ml sample
containing 5 x 10^13 M13mp8 phage, purified as described
above, was placed in a small petri plate and irradiated
with a germicidal lamp at a distance of two feet for 7
minutes (flux 150 μW/cm^2). NaN3 was added to 0.02% and
phage particles concentrated to 10^14 particles/ml on a
Centricon 30-kDa ultrafilter (Amicon).
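
Two quantities implied by the irradiation step, the delivered UV dose and the volume after concentration, follow directly from the stated figures. The sketch below is a worked calculation with hypothetical function names; it assumes the stated flux is constant over the exposure and that no phage are lost during concentration.

```python
def uv_dose_mj_per_cm2(flux_uw_per_cm2, minutes):
    """Energy delivered per unit area, in mJ/cm^2, at constant flux."""
    return flux_uw_per_cm2 * minutes * 60 / 1000.0


def concentrated_volume_ml(total_particles, target_per_ml):
    """Final volume needed to reach the target particle concentration."""
    return total_particles / target_per_ml


print(uv_dose_mj_per_cm2(150, 7))            # 63.0
print(concentrated_volume_ml(5e13, 1e14))    # 0.5
```

The 7 minute exposure thus corresponds to about 63 mJ/cm^2, and concentrating 5 x 10^13 particles to 10^14 particles/ml reduces the 5 ml sample to roughly 0.5 ml.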

For panning, polystyrene petri plates (60 x 15 mm,
Falcon; Becton Dickinson, Lincoln Park, NJ) are incubated
with 1 ml of 1 mg/ml of streptavidin (BMB) in 0.1 M
NaHCO3 pH 8.6-0.02% NaN3 in a small, air-tight plastic box
overnight in a cold room. The next day streptavidin is
removed and replaced with at least 10 ml blocking
solution (29 mg/ml of BSA; 3 μg/ml of streptavidin; 0.1 M
NaHCO3 pH 8.6-0.02% NaN3) and incubated at least 1 hour at
room temperature. The blocking solution is removed and
plates are washed rapidly three times with Tris buffered
saline containing 0.5% Tween 20 (TBS-0.5% Tween 20).

Selection of phage expressing peptides bound by the
ligand binding proteins is performed with 5 μl (2.7 μg
ligand binding protein) of blocked biotinylated ligand
binding proteins reacted with a 50 μl portion of each
library. Each mixture is incubated overnight at 4°C,
diluted with 1 ml TBS-0.5% Tween 20, and transferred to a
streptavidin-coated petri plate prepared as described
above. After rocking 10 minutes at room temperature,
unbound phage are removed and plates washed ten times
with TBS-0.5% Tween 20 over a period of 30-90 minutes.
Bound phage are eluted from plates with 800 μl sterile
elution buffer (1 mg/ml BSA, 0.1 M HCl, pH adjusted to
2.2 with glycerol) for 15 minutes and eluates neutralized
with 48 μl 2 M Tris (pH unadjusted). A 20 μl portion of
each eluate is titered on MK30-3 concentrated cells with
dilutions of input phage.

A second round of panning is performed by treating
750 μl of first eluate from each library with 5 mM DTT
for 10 minutes to break disulfide bonds linking biotin
groups to residual biotinylated binding proteins. The
treated eluate is concentrated on a Centricon 30
ultrafilter (Amicon), washed three times with TBS-0.5%
Tween 20, and concentrated to a final volume of about 50
μl. Final retentate is transferred to a tube containing
5.0 μl (2.7 μg ligand binding protein) blocked
biotinylated ligand binding proteins and incubated
overnight. The solution is diluted with 1 ml TBS-0.5%
Tween 20, panned, and eluted as described above on fresh
streptavidin-coated petri plates. The entire second
eluate (800 μl) is neutralized with 48 μl 2 M Tris, and
20 μl is titered simultaneously with the first eluate and
dilutions of the input phage.
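
Titering the eluate alongside dilutions of the input phage allows the recovery of each panning round to be expressed as a yield. The sketch below shows one way such a calculation might be set up. It is an assumption-based illustration: the plaque counts and dilution factors are hypothetical example values, not data from this disclosure, and the function names are invented for the example. The eluate volume of 0.848 ml corresponds to 800 μl of elution buffer plus 48 μl of 2 M Tris, and the input volume of 0.05 ml to the 50 μl library portion.

```python
def titer_pfu_per_ml(plaques, dilution_factor, plated_volume_ml):
    """Plaque-forming units per ml of the undiluted sample."""
    return plaques * dilution_factor / plated_volume_ml


def percent_yield(output_titer, output_volume_ml, input_titer, input_volume_ml):
    """Phage recovered in the eluate as a percentage of phage applied."""
    recovered = output_titer * output_volume_ml
    applied = input_titer * input_volume_ml
    return 100.0 * recovered / applied


# Hypothetical plate counts, for illustration only.
eluate_titer = titer_pfu_per_ml(plaques=150, dilution_factor=1e2,
                                plated_volume_ml=0.02)
input_titer = titer_pfu_per_ml(plaques=200, dilution_factor=1e8,
                               plated_volume_ml=0.02)

print(eluate_titer, input_titer)
print(percent_yield(eluate_titer, 0.848, input_titer, 0.05))
```

Comparing the yield of the second round with that of the first gives a rough indication of whether binding phage are being enriched.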

Individual phage populations are purified through 2
to 3 rounds of plaque purification. Briefly, the second
eluate titer plates are lifted with nitrocellulose
filters (Schleicher & Schuell, Inc., Keene, NH) and
processed by washing for 15 minutes in TBS (10 mM Tris-
HCl, pH 7.2, 150 mM NaCl), followed by an incubation with
shaking for an additional 1 hour at 37°C with TBS
containing 5% nonfat dry milk (TBS-5% NDM) at 0.5 ml/cm^2.
The wash is discarded and fresh TBS-5% NDM is added (0.1
ml/cm^2) containing the ligand binding protein at between
1 nM and 100 mM, preferably between 1 and 100 μM. All
incubations are carried out in heat-sealable pouches
(Sears). Incubation with the ligand binding protein
proceeds for 12-16 hours at 4°C with shaking. The
filters are removed from the bags and washed 3 times for
30 minutes at room temperature with 150 mls of TBS
containing 0.1% NDM and 0.2% NP-40 (Sigma, St. Louis,
MO). The filters are then incubated for 2 hours at room
temperature in antiserum against the ligand binding
protein at an appropriate dilution in TBS-0.5% NDM,
washed in 3 changes of TBS containing 0.1% NDM and 0.2%
NP-40 as described above and incubated in TBS containing
0.1% NDM and 0.2% NP-40 with 1 x 10^6 cpm of 125I-labeled
Protein A (specific activity = 2.1 x 10^7 cpm/μg). After
washing with TBS containing 0.1% NDM and 0.2% NP-40 as
described above, the filters are wrapped in Saran Wrap
and exposed to Kodak X-Omat x-ray film (Kodak, Rochester,
NY) for 1-12 hours at -70°C using Dupont Cronex Lightning
Plus Intensifying Screens (Dupont, Wilmington, DE).

Positive plaques identified are cored with the large
end of a pasteur pipet and placed into 1 ml of SM (5.8 g
NaCl, 2 g MgSO₄·7H₂O, 50 ml 1 M Tris-HCl, pH 7.5, 5 ml 2%
gelatin, to 1000 ml with dH₂O) plus 1-3 drops of CHCl₃
and incubated at 37°C 2-3 hours or overnight at 4°C. The
phage are diluted 1:500 in SM and 2 µl are added to 300
µl of XL1 cells plus 3 ml of soft agar per 100 mm²
plate. The XL1 cells are prepared for plating by growing
a colony overnight in 10 ml LB (10 g bacto-tryptone, 5 g
bacto-yeast extract, 10 g NaCl, 1000 ml dH₂O) containing
100 µl of 20% maltose and 100 µl of 1 M MgSO₄. The
bacteria are pelleted by centrifugation at 2000 x g for
10 minutes and the pellet is resuspended gently in 10 ml
of 10 mM MgSO₄. The suspension is diluted 4-fold by
adding 30 ml of 10 mM MgSO₄ to give an OD₆₀₀ of
approximately 0.5. The second and third round screens
are identical to that described above except that the
plaques are cored with the small end of a pasteur pipet
and placed into 0.5 ml SM plus a drop of CHCl₃, and 1-5
µl of the phage following incubation are used for plating
without dilution. At the end of the third round of
purification, an individual plaque is picked and the
templates prepared for sequencing.
Template Preparation and Sequencing

Templates are prepared for sequencing by inoculating
a 1 ml culture of 2XYT containing a 1:100 dilution of an
overnight culture of XL1 with an individual plaque. The
plaques are picked using a sterile toothpick. The
culture is incubated at 37°C for 5-6 hours with shaking
and then transferred to a 1.5 ml microfuge tube. 200 µl
of PEG solution is added, followed by vortexing, and the
tube is placed on ice for 10 minutes. The phage precipitate is
recovered by centrifugation in a microfuge at 12,000 x g
for 5 minutes. The supernatant is discarded and the
pellet is resuspended in 230 µl of TE (10 mM Tris-HCl, pH
7.5, 1 mM EDTA) by gently pipetting with a yellow pipet
tip. Phenol (200 µl) is added, followed by a brief
vortex, and the tube is microfuged to separate the phases. The
aqueous phase is transferred to a separate tube and
extracted with 200 µl of phenol/chloroform (1:1) as
described above for the phenol extraction. A 0.1 volume
of 3 M NaOAc is added, followed by addition of 2.5
volumes of ethanol, and the templates are precipitated at
-20°C for 20 minutes. The precipitated templates are recovered by
centrifugation in a microfuge at 12,000 x g for 8
minutes. The pellet is washed in 70% ethanol, dried and
resuspended in 25 µl TE. Sequencing was performed using
a Sequenase™ sequencing kit following the protocol
supplied by the manufacturer (U.S. Biochemical,
Cleveland, OH).
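
The precipitation step specifies reagent amounts as multiples of the sample volume, which can be worked out as follows. This Python sketch is illustrative only: the 200 µl sample volume is hypothetical, and computing the ethanol volume on the sample after salt addition is an assumed bench convention, since the text does not state which volume the "2.5 volumes" refers to.

```python
def ethanol_precipitation_volumes(sample_ul, salt_fraction=0.1, ethanol_fold=2.5):
    """Return (salt_ul, ethanol_ul) to add to a nucleic acid sample.

    Assumption: ethanol is calculated on the sample volume after the
    salt (3 M NaOAc) has been added; the source does not specify this.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    salt_ul = sample_ul * salt_fraction
    ethanol_ul = (sample_ul + salt_ul) * ethanol_fold
    return salt_ul, ethanol_ul


# Hypothetical aqueous phase recovered after the extractions.
salt, ethanol = ethanol_precipitation_volumes(200.0)
print(f"3 M NaOAc: {salt:.1f} µl")
print(f"ethanol:   {ethanol:.1f} µl")
print(f"total:     {200.0 + salt + ethanol:.1f} µl")
```

The total is worth checking against the 1.5 ml microfuge tube named in the protocol before starting.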
EXAMPLE II

Isolation and Characterization of Peptide Ligands
Generated from Oligonucleotides Having Random Codons at Two
Predetermined Positions

This example shows the generation of a surface
expression library from a population of oligonucleotides
having randomized codons. The oligonucleotides are ten
codons in length and are cloned into a single vector
species for the generation of a M13 gene VIII-based
surface expression library. The example also shows the
selection of peptides for a ligand binding protein and
characterization of their encoded nucleic acid sequences.

Oligonucleotide Synthesis

Oligonucleotides were synthesized as described in
Example I. The synthesizer was programmed to synthesize
the sequences shown in Table IX. These sequences
correspond to the first random codon position synthesized
and 3' flanking sequences of the oligonucleotide which
hybridizes to the leader sequence in the vector. The
complementary sequences are used for insertional
mutagenesis of the synthesized population of
oligonucleotides.

Table IX

Column       Sequence (5' to 3')

column 1     AA(A/C)GGTTGGTCGGTACCGG
column 2     AG(A/G)GGTTGGTCGGTACCGG
column 3     AT(A/G)GGTTGGTCGGTACCGG
column 4     AC(A/G)GGTTGGTCGGTACCGG
column 5     CA(G/T)GGTTGGTCGGTACCGG
column 6     CT(G/C)GGTTGGTCGGTACCGG
column 7     AG(T/C)GGTTGGTCGGTACCGG
column 8     AT(T/C)GGTTGGTCGGTACCGG
column 9     CC(A/C)GGTTGGTCGGTACCGG
column 10    T(A/T)TGGTTGGTCGGTACCGG
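
The ten degenerate triplets of Table IX can be expanded to see which residues each column contributes. The Python sketch below assumes that the synthesized triplets are the antisense strand, so that the encoded codon is the reverse complement of each triplet; this reading is an interpretation, not a statement from this example. The codon table is the standard genetic code and all names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Degenerate triplets from Table IX, written 5' to 3'; each element
# lists the bases coupled at that position in the given column.
COLUMN_TRIPLETS = {
    1: ("A", "A", "AC"),
    2: ("A", "G", "AG"),
    3: ("A", "T", "AG"),
    4: ("A", "C", "AG"),
    5: ("C", "A", "GT"),
    6: ("C", "T", "GC"),
    7: ("A", "G", "TC"),
    8: ("A", "T", "TC"),
    9: ("C", "C", "AC"),
    10: ("T", "AT", "T"),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def encoded_amino_acids(triplet):
    """Residues encoded by the strand complementary to a degenerate triplet."""
    return sorted(
        CODON_TABLE[reverse_complement("".join(bases))]
        for bases in product(*triplet)
    )


column_residues = {col: encoded_amino_acids(t) for col, t in COLUMN_TRIPLETS.items()}
all_residues = {aa for residues in column_residues.values() for aa in residues}

for col, residues in column_residues.items():
    print(f"column {col}: {'/'.join(residues)}")
print(f"{len(all_residues)} distinct residues; stop codon present: {'*' in all_residues}")
```

Under this assumption each column contributes two residues, and the ten columns together cover the amino acids without introducing a stop codon.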



The next eight random codon positions were
synthesized as described for Table V in Example I.
Following the ninth position synthesis, the reaction
products were once more combined, mixed and redistributed
into 10 new reaction columns. Synthesis of the last
random codon position and 5' flanking sequences is shown
in Table X.

Table X

Column       Sequence (5' to 3')

column 1     AGGATCCGCCGAGCTCAA(A/C)A
column 2     AGGATCCGCCGAGCTCAG(A/G)A
column 3     AGGATCCGCCGAGCTCAT(A/G)A
column 4     AGGATCCGCCGAGCTCAC(A/G)A
column 5     AGGATCCGCCGAGCTCCA(G/T)A
column 6     AGGATCCGCCGAGCTCCT(G/C)A
column 7     AGGATCCGCCGAGCTCAG(T/C)A
column 8     AGGATCCGCCGAGCTCAT(T/C)A
column 9     AGGATCCGCCGAGCTCCC(A/C)A
column 10    AGGATCCGCCGAGCTCT(A/T)TA



The reaction products were mixed once more and the
oligonucleotides cleaved and purified as recommended by
the manufacturer. The purified population of
oligonucleotides was used to generate a surface
expression library as described below.
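
The size of the sequence space produced by this synthesis follows directly from the numbers given above. The Python sketch below assumes ten codon positions, ten columns per position and two triplets per column, as read from the text and Tables IX and X. The coverage estimate further assumes that every sequence is equally likely, which is an idealization, and the clone count used is hypothetical.

```python
import math

CODON_POSITIONS = 10    # oligonucleotides are ten codons in length
COLUMNS = 10            # reaction columns per codon position
CODONS_PER_COLUMN = 2   # each column couples a two-fold degenerate triplet

codons_per_position = COLUMNS * CODONS_PER_COLUMN
library_diversity = codons_per_position ** CODON_POSITIONS


def expected_coverage(n_clones, diversity):
    """Expected fraction of distinct sequences present among n_clones,
    assuming all sequences are equally likely (an idealization)."""
    # 1 - (1 - 1/diversity)**n_clones, computed stably for large diversity
    return -math.expm1(n_clones * math.log1p(-1.0 / diversity))


print(f"codons per position: {codons_per_position}")
print(f"possible sequences:  {library_diversity:.3e}")
# Hypothetical number of independent clones, for illustration only.
print(f"coverage at 1e9 clones: {expected_coverage(10**9, library_diversity):.2e}")
```

The estimate makes clear that any practical library samples only a small part of the theoretical ten-codon sequence space.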



Vector Construction 



The vector used for generating surface expression 
libraries from a single oligonucleotide population (i.e., 
without joining together of right and left half 
oligonucleotides) is described below. The vector is a 
M13-based expression vector which directs the synthesis
of gene VIII-peptide fusion proteins (Figure 4). This
vector exhibits all the functions that the combined right 
and left half vectors of Example I exhibit. 

An M13-based vector was constructed for the cloning
and surface expression of populations of random
oligonucleotides (Figure 4, M13IX30). M13mp19 (Pharmacia)
was the starting vector. This vector was modified to
contain, in addition to the encoded wild type M13 gene
VIII: (1) a pseudo-wild type gene VIII sequence
with an amber stop codon placed between it and the
restriction sites for cloning oligonucleotides; (2) Stu
I, Spe I and Xho I restriction sites in frame with the
pseudo-wild type gVIII for cloning oligonucleotides; (3)
sequences necessary for expression, such as a promoter,
signal sequence and translation initiation signals; and (4)
various other mutations to remove redundant restriction
sites and the amino terminal portion of Lac Z.

Construction of M13IX30 was performed in four steps.
In the first step, a precursor vector containing the
pseudo gene VIII and various other mutations was
constructed, M13IX01F. The second step involved the
construction of a small cloning site in a separate
M13mp18 vector to yield M13IX03. In the third step,
expression sequences and cloning sites were constructed
in M13IX03 to generate the intermediate vector M13IX04B.
The fourth step involved the incorporation of the newly
constructed sequences from the intermediate vector into
M13IX01F to yield M13IX30. Incorporation of these
sequences linked them with the pseudo gene VIII.

Construction of the precursor vector M13IX01F was
similar to that of M13IX42 described in Example I except
for the following features: (1) M13mp19 was used as the
starting vector; (2) the Fok I site 5' to the unique Eco
RI site was not incorporated and the overhang at the
naturally occurring Fok I site at position 3547 was not
changed to 5'-CTTC-3'; (3) the spacer sequence was not
incorporated between the Eco RI and Sac I sites; and (4)
the amber codon at position 4492 was not incorporated.

In the second step, M13mp18 was mutated to remove
the 5' end of Lac Z up to the Lac i binding site and
including the Lac Z ribosome binding site and start
codon. Additionally, the polylinker was removed and a
Mlu I site was introduced in the coding region of Lac Z.
A single oligonucleotide was used for these mutageneses
and had the sequence 5'-
AAACGACGGCCAGTGCCAAGTGACGCGTGTGAAATTGTTATCC-3' (SEQ ID
NO: 41). Restriction enzyme sites for Hind III and Eco
RI were introduced downstream of the Mlu I site using the
oligonucleotide 5'-
GGCGAAAGGGAATTCTGCAAGGCGATTAAGCTTGGGTAACGCC-3' (SEQ ID
NO: 42). These modifications of M13mp18 yielded the
vector M13IX03.
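
The restriction sites carried by the two mutagenic oligonucleotides can be located with a simple string search. The Python sketch below assumes the standard recognition sequences for Mlu I (ACGCGT), Eco RI (GAATTC) and Hind III (AAGCTT), and assumes the oligonucleotide sequences are transcribed correctly from the text. Because these sites are palindromic, the search gives the same answer on either strand.

```python
RESTRICTION_SITES = {
    "Mlu I": "ACGCGT",
    "Eco RI": "GAATTC",
    "Hind III": "AAGCTT",
}

MUTAGENIC_OLIGOS = {
    "SEQ ID NO: 41": "AAACGACGGCCAGTGCCAAGTGACGCGTGTGAAATTGTTATCC",
    "SEQ ID NO: 42": "GGCGAAAGGGAATTCTGCAAGGCGATTAAGCTTGGGTAACGCC",
}


def find_sites(sequence, sites=RESTRICTION_SITES):
    """Map each enzyme to the 0-based offsets of its site in sequence."""
    hits = {}
    for enzyme, site in sites.items():
        offsets = []
        start = sequence.find(site)
        while start != -1:
            offsets.append(start)
            start = sequence.find(site, start + 1)
        if offsets:
            hits[enzyme] = offsets
    return hits


for name, seq in MUTAGENIC_OLIGOS.items():
    print(name, find_sites(seq))
```

A search of this kind is a quick check that a mutagenic oligonucleotide introduces the intended site and no unintended one.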

The expression sequences and cloning sites were
introduced into M13IX03 by chemically synthesizing a
series of oligonucleotides which encode both strands of
the desired sequence. The oligonucleotides are presented
in Table XI (SEQ ID NOS: 43 through 50).
TABLE XI
M13IX30 Oligonucleotide Series

Top Strand
Oligonucleotides    Sequence (5' to 3')

084    GGCGTTACCCAAGCTTTGTACATGGAGAAAATAAAG
027    TGAAACAAAGCACTATTGCACTGGCACTCTTACCGTTACCGT
028    TACTGTTTACCCCTGTGACAAAAGCCGCCCAGGTCCAGCTGC
029    TCGAGTCAGGCCTATTGTGCCCAGGGATTGTACTAGTGGATCCG

Bottom Strand
Oligonucleotides    Sequence (5' to 3')

085    TGGCGAAAGGGAATTCGGATCCACTAGTACAATCCCTG
031    GGCACAATAGGCCTGACTCGAGCAGCTGGACCAGGGCGGCTT
032    TTGTCACAGGGGTAAACAGTAACGGTAACGGTAAGTGTGCCA
033    GTGCAATAGTGCTTTGTTTCACTTTATTTTCTCCATGTACAA



The above oligonucleotides except for the terminal
oligonucleotides 084 (SEQ ID NO: 43) and 085 (SEQ ID NO:
47) of Table XI were mixed, phosphorylated, annealed and
ligated to form a double stranded insert as described in
Example I. However, instead of cloning directly into the
intermediate vector the insert was first amplified by PCR
using the terminal oligonucleotides 084 (SEQ ID NO: 43)
and 085 (SEQ ID NO: 47) as primers. The terminal
oligonucleotide 084 (SEQ ID NO: 43) contains a Hind III
site 10 nucleotides internal to its 5' end.
Oligonucleotide 085 (SEQ ID NO: 47) has an Eco RI site at
its 5' end. Following amplification, the products were
restricted with Hind III and Eco RI and ligated as
described in Example I into the polylinker of M13mp18
digested with the same two enzymes. The resultant double
stranded insert contained a ribosome binding site, a
translation initiation codon followed by a leader
sequence and three restriction enzyme sites for cloning
random oligonucleotides (Xho I, Stu I, Spe I). The
vector was named M13IX04.
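
The positions of the primer and cloning sites described above can be checked against the Table XI sequences. The Python sketch below joins the four top strand oligonucleotides end to end and searches for each recognition sequence. It assumes the sequences are transcribed correctly and uses the standard recognition sequences for Hind III, Eco RI, Xho I, Stu I and Spe I; the variable names are illustrative.

```python
TOP_STRAND_OLIGOS = {
    "084": "GGCGTTACCCAAGCTTTGTACATGGAGAAAATAAAG",
    "027": "TGAAACAAAGCACTATTGCACTGGCACTCTTACCGTTACCGT",
    "028": "TACTGTTTACCCCTGTGACAAAAGCCGCCCAGGTCCAGCTGC",
    "029": "TCGAGTCAGGCCTATTGTGCCCAGGGATTGTACTAGTGGATCCG",
}
PRIMER_085 = "TGGCGAAAGGGAATTCGGATCCACTAGTACAATCCCTG"

HIND_III = "AAGCTT"
ECO_RI = "GAATTC"
CLONING_SITES = {"Xho I": "CTCGAG", "Stu I": "AGGCCT", "Spe I": "ACTAGT"}

# Top strand of the insert, oligonucleotides joined in Table XI order.
top_strand = "".join(TOP_STRAND_OLIGOS.values())

# Offsets (0-based) of the sites used to restrict the PCR product.
hind_iii_offset = TOP_STRAND_OLIGOS["084"].find(HIND_III)
eco_ri_offset = PRIMER_085.find(ECO_RI)

# Offsets of the three cloning sites along the assembled top strand.
cloning_offsets = {
    name: top_strand.find(site) for name, site in CLONING_SITES.items()
}

print(f"Hind III in 084 at offset {hind_iii_offset}")
print(f"Eco RI in 085 at offset {eco_ri_offset}")
for name, offset in cloning_offsets.items():
    print(f"{name} at offset {offset} of the top strand")
```

Note that the Xho I site only appears once oligonucleotides 028 and 029 are joined, since it spans their junction.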

During cloning of the double-stranded insert, it was
found that one of the GCC codons in oligonucleotide 028
and its complement in 031 was deleted. Since this
deletion did not affect function, the final construct is
missing one of the two GCC codons. Additionally,
oligonucleotide 032 contained a GTG codon where a GAG
codon was needed. Mutagenesis was performed using the
oligonucleotide 5'-TAACGGTAAGAGTGCCAGTGC-3' (SEQ ID NO:
51) to convert the codon to the desired sequence. The
resultant intermediate vector was named M13IX04B.
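
The codon discrepancy in oligonucleotide 032 can be reproduced by aligning the reverse complement of a bottom strand oligonucleotide against the assembled top strand. The Python sketch below is one way to do this, assuming the Table XI sequences are transcribed correctly; the sliding ungapped comparison is a simplification chosen for illustration, not the method used in the original work. With the sequences as given here, 033 pairs without mismatch while 032 shows a single mismatched base, in line with the GTG/GAG difference described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP_STRAND = (
    "GGCGTTACCCAAGCTTTGTACATGGAGAAAATAAAG"            # 084
    "TGAAACAAAGCACTATTGCACTGGCACTCTTACCGTTACCGT"      # 027
    "TACTGTTTACCCCTGTGACAAAAGCCGCCCAGGTCCAGCTGC"      # 028
    "TCGAGTCAGGCCTATTGTGCCCAGGGATTGTACTAGTGGATCCG"    # 029
)
OLIGO_032 = "TTGTCACAGGGGTAAACAGTAACGGTAACGGTAAGTGTGCCA"
OLIGO_033 = "GTGCAATAGTGCTTTGTTTCACTTTATTTTCTCCATGTACAA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def best_pairing(top, bottom_oligo):
    """Slide the reverse complement of bottom_oligo along top and return
    (offset, mismatch_positions) for the best ungapped alignment."""
    probe = reverse_complement(bottom_oligo)
    best = None
    for offset in range(len(top) - len(probe) + 1):
        window = top[offset:offset + len(probe)]
        diffs = [i for i, (a, b) in enumerate(zip(window, probe)) if a != b]
        if best is None or len(diffs) < len(best[1]):
            best = (offset, diffs)
    return best


for name, oligo in (("032", OLIGO_032), ("033", OLIGO_033)):
    offset, diffs = best_pairing(TOP_STRAND, oligo)
    print(f"oligo {name}: offset {offset}, {len(diffs)} mismatch(es)")
```

Running such a check on designed oligonucleotides before synthesis would flag a miscoded base of this kind early.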

The fourth step in constructing M13IX30 involved
inserting the expression and cloning sequences from
M13IX04B upstream of the pseudo-wild type gVIII in
M13IX01F. This was accomplished by digesting M13IX04B
with Dra III and Bam HI and gel isolating the 700 base
pair insert containing the sequences of interest.
M13IX01F was likewise digested with Dra III and Bam HI.
The insert was combined with the double digested vector
at a molar ratio of 3:1 and ligated as described in
Example I. It should be noted that all modifications in
the vectors described herein were confirmed by sequence
analysis. The sequence of the final construct, M13IX30,
is shown in Figure 7 (SEQ ID NO: 3). Figure 4 also shows
M13IX30 where each of the elements necessary for surface
expression of randomized oligonucleotides is marked.

Library Construction, Screening and Characterization of 
Encoded Oligonucleotides 



Construction of an M13IX30 surface expression
library is accomplished identically to that described in
Example I for sublibrary construction except the
oligonucleotides described above are inserted into
M13IX30 by mutagenesis instead of by ligation. The
library is constructed and propagated on MK30-3 (BMB) and
phage stocks are prepared for infection of XL1 cells and
screening. The surface expression library is screened
and encoding oligonucleotides characterized as described
in Example I.



EXAMPLE III

Isolation and Characterization of Peptide Ligands
Generated from Right and Left Half
Degenerate Oligonucleotides

This example shows the construction and expression
of a surface expression library of degenerate
oligonucleotides. The encoded peptides of this example
derive from the mixing and joining together of two
separate oligonucleotide populations. Also demonstrated
is the isolation and characterization of peptide ligands
and their corresponding nucleotide sequence for specific
binding proteins.
Synthesis of Oligonucleotide Populations

A population of left half degenerate
oligonucleotides and a population of right half
degenerate oligonucleotides were synthesized using
standard automated procedures as described in Example I.

The degenerate codon sequences for each population
of oligonucleotides were generated by sequentially
synthesizing the triplet NNG/T where N is an equal
mixture of all four nucleotides. The antisense sequence
for each population of oligonucleotides was synthesized
and each population contained 5' and 3' flanking
sequences complementary to the vector sequence. The
complementary termini were used to incorporate each
population of oligonucleotides into their respective
vectors by standard mutagenesis procedures. Such
procedures have been described previously in Example I
and in the Detailed Description. Synthesis of the
antisense sequence of each population was necessary since
the single-stranded form of the vectors is obtained only
as the sense strand.

The left half oligonucleotide population was
synthesized having the following sequence: 5'-
AGCTCCCGGATGCCTCAGAAGATG(A/CNN)₉GGCTTTTGCCACAGGGG-3' (SEQ
ID NO: 52). The right half oligonucleotide population
was synthesized having the following sequence: 5'-
CAGCCTCGGATCCGCC(A/CNN)₁₀ATG(A/C)GAAT-3' (SEQ ID NO: 53).
These two oligonucleotide populations when incorporated
into their respective vectors and joined together encode
a 20 codon oligonucleotide having 19 degenerate positions
and an internal predetermined codon sequence.
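
The properties of the NNG/T triplet can be enumerated directly. The Python sketch below lists the 32 sense codons, translates them with the standard genetic code, and shows that their reverse complements take the (A/C)NN form used in the antisense oligonucleotides above. All names are illustrative; the count of 19 degenerate positions is taken from the text.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sense codons of the form NNG/T: any base, any base, then G or T.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

# Antisense triplets as synthesized: reverse complement of each sense codon.
antisense_triplets = [c.translate(COMPLEMENT)[::-1] for c in nnk_codons]

stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
amino_acids = sorted({CODON_TABLE[c] for c in nnk_codons} - {"*"})

DEGENERATE_POSITIONS = 19  # from the text
sequence_space = len(nnk_codons) ** DEGENERATE_POSITIONS

print(f"{len(nnk_codons)} codons encode {len(amino_acids)} amino acids")
print(f"stop codons within NNG/T: {stop_codons}")
print(f"antisense first bases: {sorted({t[0] for t in antisense_triplets})}")
print(f"nucleotide sequences over 19 positions: {sequence_space:.3e}")
```

Restricting the third position to G or T keeps every amino acid available while leaving a single stop codon among the possible triplets.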



Vector Construction 



Modified forms of the previously described vectors
were used for the construction of right and left half
sublibraries. The construction of left half sublibraries
was performed in an M13-based vector termed M13ED03.
This vector is a modified form of the previously 
described M13IX30 vector and contains all the essential 
features of both M13IX30 and M13IX22. M13ED03 contains, 
in addition to a wild type and a pseudo-wild type gene 
VIII, sequences necessary for expression and two Fok I 
sites for joining with a right half oligonucleotide 
sublibrary. Therefore, this vector combines the 
advantages of both previous vectors in that it can be 
used for the generation and expression of surface 
expression libraries from a single oligonucleotide 
population or it can be joined with a sublibrary to bring 
together right and left half oligonucleotide populations 
into a surface expression library. 

M13ED03 was constructed in two steps from M13IX30.
The first step involved the modification of M13IX30 to
remove a redundant sequence and to incorporate a sequence
encoding the eight amino-terminal residues of human
β-endorphin. The leader sequence was also mutated to
increase secretion of the product.

During construction of M13IX04 (an intermediate
vector to M13IX30 which is described in Example II), a
six nucleotide sequence was duplicated in oligonucleotide
027 (SEQ ID NO: 44) and its complement 032 (SEQ ID NO:
49). This sequence, 5'-TTACCG-3', was deleted by
mutagenesis in the construction of M13ED01. The
oligonucleotide used for the mutagenesis was 5'-
GGTAAACAGTAACGGTAAGAGTGCCAG-3' (SEQ ID NO: 54). The
mutation in the leader sequence was generated using the
oligonucleotide 5'-GGGCTTTTGCCACAGGGGT-3' (SEQ ID NO:
55). This mutagenesis resulted in the A residue at
position 6353 of M13IX30 being changed to a G residue.
The resultant vector was designated M13IX32.

To generate M13ED01, the nucleotide sequence
encoding β-endorphin (8 amino acid residues of
β-endorphin plus 3 extra amino acid residues) was
incorporated after the leader sequence by mutagenesis.
The oligonucleotide used had the following sequence: 5'-
AGGGTCATCGCCTTCAGCTCCGGATCCCTCAGAAGTCATAAACCCCCCATAGGC
TTTTGCCAC-3' (SEQ ID NO: 56). This mutagenesis also
removed some of the downstream sequences through the Spe
I site.

The second step in the construction of M13ED03
involved vector changes which put the β-endorphin
sequence in frame with the downstream pseudo-gene VIII
sequence and incorporated a Fok I site for joining with a
sublibrary of right half oligonucleotides. This vector
was designed to incorporate oligonucleotide populations
by mutagenesis using sequences complementary to those
flanking or overlapping with the encoded β-endorphin
sequence. The absence of β-endorphin expression after
mutagenesis can therefore be used to measure the
mutagenesis frequency. In addition to the above vector
changes, M13ED03 was also modified to contain an amber
codon at position 3262 for biological selection during
joining of right and left half sublibraries.

The mutations were incorporated using standard
mutagenesis procedures as described in Example I. The
frame shift changes and Fok I site were generated using
the oligonucleotide 5'-
TCGCCTTCAGCTCCCGGATGCCTCAGAAGCATGAACCCCCCATAGGC-3' (SEQ
ID NO: 57). The amber codon was generated using the
oligonucleotide 5'-CAATTTTATCCTAAATCTTACCAAC-3' (SEQ ID
NO: 58). The full sequence of the resultant vector,
M13ED03, is provided in Figure 8 (SEQ ID NO: 4).

The construction of right half oligonucleotide
sublibraries was performed in a modified form of the
M13IX42 vector. The new vector, M13IX421, is identical
to M13IX42 except that the amber codon between the Eco
RI-Sac I cloning site and the pseudo-gene VIII sequence
was removed. This change ensures that all expression off
of the Lac Z promoter produces a peptide-gene VIII fusion
protein. Removal of the amber codon was performed by
mutagenesis using the following oligonucleotide: 5'-
GCCTTCAGCCTCGGATCCGCC-3' (SEQ ID NO: 59). The full
sequence of M13IX421 is shown in Figure 9 (SEQ ID NO: 5).

Library Construction, Screening and Characterization of 
Encoded Oligonucleotides 

A sublibrary was constructed for each of the
previously described degenerate populations of
oligonucleotides. The left half population of
oligonucleotides was incorporated into M13ED03 to
generate the sublibrary M13ED03.L and the right half
population of oligonucleotides was incorporated into
M13IX421 to generate the sublibrary M13IX421.R. Each of
the oligonucleotide populations was incorporated into
its respective vector using site-directed mutagenesis
as described in Example I. Briefly, the nucleotide
sequences flanking the degenerate codon sequences were
complementary to the vector at the site of incorporation.
The populations of nucleotides were hybridized to
single-stranded M13ED03 or M13IX421 vectors and extended with T4
DNA polymerase to generate a double-stranded circular
vector. Mutant templates were obtained by uridine
selection in vivo, as described by Kunkel et al., supra.
Each of the vector populations was electroporated into
host cells and propagated as described in Example I.

The random joining of right and left half
sublibraries into a single surface expression library was
accomplished as described in Example I except that prior
to digesting each vector population with Fok I they were
first digested with an enzyme that cuts in the unwanted
portion of each vector. Briefly, M13ED03.L was digested
with Bgl II (cuts at 7094) and M13IX421.R was digested
with Hind III (cuts at 3919). Each of the digested
populations was further treated with alkaline
phosphatase to ensure that the ends would not religate
and then digested with an excess of Fok I. Ligations,
electroporation and propagation of the resultant library
were performed as described in Example I.



The surface expression library was screened for
ligand binding proteins using a modified panning
procedure. Briefly, 1 ml of the library, about 10¹² phage
particles, was added to 1-5 µg of the ligand binding
protein. The ligand binding protein was either an
antibody or receptor globulin (Rg) molecule, Aruffo et
al., Cell 61:1303-1313 (1990), which is incorporated
herein by reference. Phage were incubated shaking with
affinity ligand at room temperature for 1 to 3 hours
followed by the addition of 200 µl of latex beads
(Biosite, San Diego, CA) which were coated with
goat-antimouse IgG. This mixture was incubated shaking for an
additional 1-2 hours at room temperature. Beads were
pelleted for 2 minutes by centrifugation in a microfuge
and washed with TBS which can contain 0.1% Tween 20.
Three additional washes were performed where the last
wash did not contain any Tween 20. The bound phage were
then eluted with 200 µl 0.1 M Glycine-HCl, pH 2.2 for 15
minutes and the beads were spun down by centrifugation.
The supernatant containing phage (eluate) was removed and
phage exhibiting binding to the ligand binding protein
were further enriched by one to two more cycles of
panning. Typical yields after the first eluate were
about 1 x 10⁶ - 5 x 10⁶ pfu. The second and third eluate
generally yielded about 5 x 10⁶ - 2 x 10⁷ pfu and 5 x 10⁷
- 1 x 10¹⁰ pfu, respectively.
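
The yield ranges quoted above can be put on a common scale with a little arithmetic. The Python sketch below computes the first round recovery relative to the roughly 10¹² input phage and compares the geometric midpoints of successive eluate ranges. Treating phage particles and pfu as directly comparable, and using geometric midpoints to summarize each range, are simplifying assumptions made only for illustration.

```python
import math

INPUT_PHAGE = 1e12  # about 10^12 phage particles in 1 ml of library

# Typical yield ranges (pfu) quoted in the text for each eluate.
ELUATE_YIELDS_PFU = {
    "first": (1e6, 5e6),
    "second": (5e6, 2e7),
    "third": (5e7, 1e10),
}


def geometric_midpoint(low, high):
    """Geometric mean of a range, suited to values spanning decades."""
    return math.sqrt(low * high)


first_low, first_high = ELUATE_YIELDS_PFU["first"]
first_round_recovery = (first_low / INPUT_PHAGE, first_high / INPUT_PHAGE)

midpoints = {
    name: geometric_midpoint(low, high)
    for name, (low, high) in ELUATE_YIELDS_PFU.items()
}
fold_second_over_first = midpoints["second"] / midpoints["first"]
fold_third_over_second = midpoints["third"] / midpoints["second"]

print(f"first round recovery: {first_round_recovery[0]:.0e} to {first_round_recovery[1]:.0e}")
for name, value in midpoints.items():
    print(f"{name} eluate midpoint: {value:.2e} pfu")
print(f"second/first: {fold_second_over_first:.1f}-fold")
print(f"third/second: {fold_third_over_second:.1f}-fold")
```

A rising yield from one eluate to the next is what would be expected if binding phage are being enriched by each cycle of panning.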

The second or third eluate was plated at a suitable
density for plaque identification screening and
sequencing of positive clones (i.e., plated at confluency
for rare clones and 200-500 plaques/plate if pure plaques
were needed). Briefly, plaques were grown for about 6 hours
at 37°C and were overlaid with nitrocellulose filters
that had been soaked in 2 mM IPTG and then briefly dried.
The filters remained on the plaques overnight at room
temperature and were then removed and placed in blocking
solution for 1-2 hours. Following blocking, the filters were
incubated in 1 µg/ml ligand binding protein in blocking
solution for 1-2 hours at room temperature. Goat
antimouse Ig-coupled alkaline phosphatase (Fisher) was
added at a 1:1000 dilution and the filters were rapidly
washed with 10 ml of TBS or block solution over a glass
vacuum filter. Positive plaques were identified after
alkaline phosphatase development for detection.
Screening of the degenerate oligonucleotide library
with several different ligand binding proteins resulted
in the identification of peptide sequences which bound to
each of the ligands. For example, screening with an
antibody to β-endorphin resulted in the detection of
about 30-40 different clones which essentially all had
the core amino acid sequence known to interact with the
antibody. The sequences flanking the core sequences were
different, showing that they were independently derived
and not duplicates of the same clone. Screening with an
antibody known as 57 gave similar results (i.e., a core
consensus sequence was identified but the flanking
sequences among the clones were different).

EXAMPLE IV

[illegible] Left Half Random Oligonucleotide Library

This example shows the synthesis and construction of
a left half random oligonucleotide library.

A population of random oligonucleotides nine codons
in length was synthesized as described in Example I
except that different sequences at their 5' and 3' ends
were synthesized so that they could be easily inserted
into the vector by mutagenesis. Also, the mixing and
dividing steps for generating random distributions of
reaction products were performed by the alternative method
of dispensing equal volumes of bead suspensions. The
liquid chosen that was dense enough for the beads to
remain dispersed was 100% acetonitrile.

Briefly, each column was prepared for the first
coupling reaction by suspending 22 mg (1 µmole) of 48
µmol/g capacity beads (Genta, San Diego, CA) in 0.5 ml
of 100% acetonitrile. These beads are smaller than those
described in Example I and are derivatized with a guanine
nucleotide. They also do not have a controlled pore
size. The bead suspension was then transferred to an
empty reaction column. Suspensions were kept relatively
dispersed by gently pipetting the suspension during
transfer. Columns were plugged and monomer coupling
reactions were performed as shown in Table XII.
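
The bead quantity quoted above follows from the loading capacity. The Python sketch below reproduces the arithmetic for the 22 mg and 48 µmol/g figures from the text; the function names are illustrative.

```python
def loaded_micromoles(bead_mass_mg, capacity_umol_per_g):
    """Micromoles of starting nucleotide carried by a mass of beads."""
    return bead_mass_mg / 1000.0 * capacity_umol_per_g


def bead_mass_for_scale(scale_umol, capacity_umol_per_g):
    """Bead mass in mg needed for a synthesis scale in micromoles."""
    return scale_umol / capacity_umol_per_g * 1000.0


# Figures from the text: 22 mg of 48 µmol/g beads per column.
per_column = loaded_micromoles(22.0, 48.0)
print(f"22 mg at 48 µmol/g carries {per_column:.3f} µmol")
print(f"mass for exactly 1 µmol: {bead_mass_for_scale(1.0, 48.0):.1f} mg")
```

The result is close to the 1 µmole scale stated for each column.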
Table XII

Column        Sequence (5' to 3')

column 1L     AA(A/C)GGCTTTTGCCACAGG
column 2L     AG(A/G)GGCTTTTGCCACAGG
column 3L     AT(A/G)GGCTTTTGCCACAGG
column 4L     AC(A/G)GGCTTTTGCCACAGG
column 5L     CA(G/T)GGCTTTTGCCACAGG
column 6L     CT(G/C)GGCTTTTGCCACAGG
column 7L     AG(T/C)GGCTTTTGCCACAGG
column 8L     AT(T/C)GGCTTTTGCCACAGG
column 9L     CC(A/C)GGCTTTTGCCACAGG
column 10L    T(A/T)TGGCTTTTGCCACAGG
After coupling of the last monomer, the columns were
unplugged as described previously, and their contents were
poured into a 1.5 ml microfuge tube. The columns were
washed with 100% acetonitrile to recover any remaining
beads. The volume used for rinsing was determined so
that the final volume of total bead suspension was about
[illegible] for each new reaction column the beads would
be aliquoted into. The mixture was [illegible] to
produce a uniformly dispersed suspension and then
divided, with constant pipetting of the mixture, into
equal volumes. Each mixture of beads was then
transferred to an empty reaction column. The empty tubes
were washed with a [illegible] of 100% acetonitrile and
also transferred to their respective columns. Random
codon positions 3 through 9 were then synthesized as
described in Example I where the mixing and dividing
steps were performed using a suspension in 100%
acetonitrile. The coupling reactions for codon positions
2 through 9 are shown in Table XIII.
Table XIII

Column        Sequence (5' to 3')

column 1L     AA(A/C)A
column 2L     AG(A/G)A
column 3L     AT(A/G)A
column 4L     AC(A/G)A
column 5L     CA(G/T)A
column 6L     CT(G/C)A
column 7L     AG(T/C)A
column 8L     AT(T/C)A
column 9L     CC(A/C)A
column 10L    T(A/T)TA
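
The mix-and-divide scheme used for the random codon positions can be sketched as a small simulation. The Python code below is a simplified model rather than a description of the synthesizer: beads are pooled and shuffled, divided into ten equal portions, and each portion receives one of the two triplets coupled in its column. The bead count, the random seed and the equal chance of either triplet within a column are assumptions made for illustration.

```python
import random
from itertools import product

# Degenerate triplets coupled in columns 1L through 10L, written 5' to 3'.
COLUMN_TRIPLETS = [
    ("A", "A", "AC"), ("A", "G", "AG"), ("A", "T", "AG"), ("A", "C", "AG"),
    ("C", "A", "GT"), ("C", "T", "GC"), ("A", "G", "TC"), ("A", "T", "TC"),
    ("C", "C", "AC"), ("T", "AT", "T"),
]
ALLOWED_TRIPLETS = {
    "".join(bases) for triplet in COLUMN_TRIPLETS for bases in product(*triplet)
}


def couple(column_index, rng):
    """Return one concrete triplet as coupled in the given column."""
    return "".join(rng.choice(bases) for bases in COLUMN_TRIPLETS[column_index])


def mix_and_divide(n_beads, n_positions, seed=0):
    """Simulate n_positions rounds of pooling, dividing and coupling."""
    rng = random.Random(seed)
    beads = [""] * n_beads
    for _ in range(n_positions):
        rng.shuffle(beads)                      # mix the pooled beads
        for i in range(n_beads):
            column = i % len(COLUMN_TRIPLETS)   # divide into equal portions
            # Synthesis runs 3' to 5', so each new triplet is added
            # at the 5' end of the growing chain.
            beads[i] = couple(column, rng) + beads[i]
    return beads


library = mix_and_divide(n_beads=1000, n_positions=9)
print(f"{len(set(library))} distinct sequences on {len(library)} simulated beads")
print("example:", library[0])
```

Because the beads are re-pooled before every position, the column a bead visits in one round is independent of the columns it visited earlier, which is what randomizes the codon order.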



After coupling of the last monomer for the ninth
codon position, the reaction products were mixed and a
portion was transferred to an empty reaction column.
Columns were plugged and the following monomer coupling
reactions were performed: 5'-CGGATGCCTCAGAAGCCCCXXA-3'
(SEQ ID NO: 60). The resulting population of random
oligonucleotides was purified and incorporated by
mutagenesis into the left half vector M13ED04.

M13ED04 is a modified version of the M13ED03 vector
described in Example III and therefore contains all the
features of that vector. The difference between M13ED03
and M13ED04 is that M13ED04 does not contain the five
amino acid sequence (Tyr Gly Gly Phe Met) recognized by
anti-β-endorphin antibody. This sequence was deleted by
mutagenesis using the oligonucleotide 5'-
CGGATGCCTCAGAAGGGCTTTTGCCACAGG-3' (SEQ ID NO: 61). The
entire nucleotide sequence of this vector is shown in
Figure 10 (SEQ ID NO: 6).
Although the invention has been described with
reference to the presently preferred embodiment, it 
should be understood that various modifications can be 
made without departing from the spirit of the invention. 
Accordingly, the invention is limited only by the claims. 



